Abstract

We give a complete algorithm and source code for constructing what we
refer to as heterotic risk models (for equities), which combine: i) granularity
of an industry classification; ii) diagonality of the principal component factor
covariance matrix for any sub-cluster of stocks; and iii) dramatic reduction of
the factor covariance matrix size in the Russian-doll risk model construction.
This appears to be a powerful approach for constructing out-of-sample
stable short-lookback risk models. Thus, for intraday mean-reversion alphas
based on overnight returns, Sharpe ratio optimization using our heterotic risk
models sizably improves the performance characteristics compared to weighted
regressions based on principal components or industry classification. We also
give source code for: a) building statistical risk models; and b) Sharpe ratio
optimization with homogeneous linear constraints and position bounds.

1 Introduction

When the number of stocks in a portfolio is large and the number of available
(relevant) observations in the historical time series of returns is limited - which
is essentially a given for short-horizon quantitative trading strategies - the sample
covariance matrix (SCM) is badly singular. This makes portfolio optimization - e.g.,
Sharpe ratio maximization - challenging as it requires the covariance matrix to be
invertible. A standard method for circumventing this difficulty is to employ factor
models,^3 which, instead of computing SCM for a large number of stocks, allow one to
compute a factor covariance matrix (FCM) for many fewer risk factors. However,
the number of relevant risk factors itself can be rather large. E.g., in a (desirably)
granular industry classification (IC), the number of industries^4 can be in three digits
for a typical (liquid) trading universe. Then, even the sample FCM can be singular.

In (Kakushadze, 2015c) a simple idea is set forth: model FCM itself via a factor
model, and repeat this process until the remaining FCM is small enough and can be
computed. In fact, at the end of this process we may even end up with a single factor,
for which “FCM” is simply its variance.^5 This construction - termed “Russian-doll”
risk models (Kakushadze, 2015c) - dramatically reduces the number of, or
altogether eliminates, the factors for which (off-diagonal) FCM must be computed.
The “catch” is that at each successive step we must: i) identify the risk factors; and
ii) compute the specific (idiosyncratic) risk (ISR) and FCM consistently.

^3 For a partial list of literature related to factor risk models, see, e.g., (Acharya and Pedersen,
2005), (Ang et al, 2006), (Anson, 2013/14), (Asness, 1995), (Asness and Stevens, 1995), (Asness et
al, 2001), (Bai, 2003), (Bai and Li, 2012), (Bai and Ng, 2002), (Bansal and Viswanathan, 1993),
(Banz, 1981), (Basu, 1977), (Black, 1972), (Black et al, 1972), (Blume and Friend, 1973), (Brandt
et al, 2010), (Briner and Connor, 2008), (Burmeister and Wall, 1986), (Campbell, 1987), (Campbell
et al, 2001), (Campbell and Shiller, 1988), (Carhart, 1997), (Chamberlain and Rothschild, 1983),
(Chan et al, 1985), (Chen et al, 1986, 1990), (Chicheportiche and Bouchaud, 2014), (Cochrane,
2001), (Connor, 1984, 1995), (Connor and Korajczyk, 1988, 1989, 2010), (Daniel and Titman, 1997),
(DeBondt and Thaler, 1985), (Dhrymes et al, 1984), (Fama and French, 1992, 1993, 1996, 2015),
(Fama and MacBeth, 1973), (Ferson and Harvey, 1991, 1999), (Forni et al, 2000, 2005), (Forni and
Lippi, 2001), (Goyal et al, 2008), (Goyal and Santa-Clara, 2003), (Grinold and Kahn, 2000), (Hall
et al, 2002), (Haugen, 1995), (Heaton and Lucas, 1999), (Heston and Rouwenhorst, 1994),
(Jagannathan and Wang, 1996), (Jegadeesh and Titman, 1993, 2001), (Kakushadze, 2014, 2015a, 2015c),
(Kakushadze and Liew, 2015), (King, 1966), (Korajczyk and Sadka, 2008), (Kothari and Shanken,
1997), (Lakonishok et al, 1994), (Lee and Stefek, 2008), (Lehmann and Modest, 1988), (Liew and
Vassalou, 2000), (Lintner, 1965), (Lo, 2010), (Lo and MacKinlay, 1990), (MacKinlay, 1995),
(MacQueen, 2003), (Markowitz, 1952, 1984), (Menchero and Mitra, 2008), (Merton, 1973), (Miller,
2006), (Motta et al, 2011), (Mukherjee and Mishra, 2005), (Ng et al, 1992), (Pastor and
Stambaugh, 2003), (Roll and Ross, 1980), (Rosenberg, 1974), (Ross, 1976, 1978a, 1978b), (Scholes and
Williams, 1977), (Schwert, 1990), (Shanken, 1987, 1990), (Shanken and Weinstein, 2006), (Sharpe,
1963, 1964), (Stock and Watson, 2002a, 2002b), (Stroyny, 2005), (Treynor, 1999), (Vassalou,
2003), (Whitelaw, 1997), (Zangari, 2003), (Zhang, 2010), and references therein.

^4 By this we mean the stock clusters at the most granular level in the IC hierarchy. E.g., in
BICS these would be sub-industries, whereas other ICs have different naming conventions.

^5 Generally, off-diagonal elements of a sample (stock or factor) covariance matrix tend to be
unstable out-of-sample, whereas its diagonal elements (variances) typically are much more stable.

Identifying the risk factors in the Russian-doll construction is facilitated by using
a binary industry classification:^6 using BICS as an illustrative example, industries
serve as the risk factors for sub-industries; sectors - there are only 10 of them - serve
as the risk factors for industries; and - if need be - the “market” serves as the sole
risk factor for sectors. Correctly computing ISR and FCM is less trivial: the
algorithms for this are generally deemed proprietary. One method in the “lore” is to
use a cross-sectional linear regression, where the returns are regressed over a factor
loadings matrix (FLM), and FCM is identified with the (serial) covariance matrix
of the regression coefficients, whereas ISR squared is identified with the (serial)
variance of the regression residuals. However, as discussed in (Kakushadze, 2015c),
generally this does not satisfy a nontrivial requirement (which is often overlooked
in practice) that the factor model reproduce the historical in-sample total variance.

In this paper we share a complete algorithm and source code for building what we
refer to as “heterotic” risk models. It is based on a simple observation that, if we use
principal components (PCs) as FLM, the aforementioned total variance condition
is automatically satisfied. Unfortunately, the number of useful PCs is small as it is
limited by the number of observations, and they also tend to be unstable out-of-sample
(as they are based on off-diagonal covariances), with the first PC being most
stable. We circumvent this by building FLM from the first PCs of the blocks
(sub-matrices) of the sample correlation matrix^7 corresponding to - in the BICS language
- the sub-industries. I.e., if there are $N$ stocks and $K$ sub-industries, FLM is $N \times K$,
and in each column the elements corresponding to the tickers in that sub-industry
are proportional to the first PC of the corresponding block, with all other elements
vanishing.^8 The total variance condition is automatically satisfied. Then, applying
the Russian-doll construction yields a nonsingular factor model covariance matrix,
which, considering it sizably adds value in Sharpe ratio optimization for certain
intraday mean-reversion alphas we backtest, appears to be stable out-of-sample.

Heterotic risk models are based on our proprietary know-how. We hope sharing
it with the investment community encourages organic custom risk model building.

This paper is organized as follows. In Section 2 we briefly review some generalities
of factor models and discuss in detail the total variance condition. In Section 3
we discuss the PC approach and an algorithm for fixing the number of PC factors,
with the R source code in Appendix A. We discuss heterotic risk models in detail in
Section 4, with the complete Russian-doll embedding in Section 5 and the R source
code in Appendix B. In Section 6 we run a horse race of intraday mean-reversion
alphas via i) weighted regressions and ii) optimization using heterotic risk models.
For optimization with homogeneous linear constraints and (liquidity/position)
bounds we use the R source code in Appendix C.^9 We briefly conclude in Section 7.


^6 The number of non-binary style factors is at most of order 10 and does not pose a difficulty for
computing the factor covariance matrix. It is the ubiquitous industry factors that are problematic.

^7 And not SCM - this is an important technical detail, see the discussion in Section 2.4.

^8 Note that this is not the same as a “hybrid” (mixture) of industry and statistical risk factors.

^9 The source code given in the appendices is not written to be “fancy” or optimized for speed

2 Multi-factor Risk Models

2.1 Generalities

In a multi-factor risk model, a sample covariance matrix $C_{ij}$ for $N$ stocks, $i,j = 1,\dots,N$,
which is computed based on a time series of stock returns $R_i$ (e.g., daily
close-to-close returns), is modeled via a constructed covariance matrix $\Gamma_{ij}$ given by

$$\Gamma \equiv \Xi + \Omega\,\Phi\,\Omega^T \qquad (1)$$

$$\Xi_{ij} \equiv \xi_i^2\,\delta_{ij} \qquad (2)$$

where $\delta_{ij}$ is the Kronecker delta; $\Gamma_{ij}$ is an $N \times N$ matrix; $\xi_i$ is the specific risk (a.k.a.
idiosyncratic risk) for each stock; $\Omega_{iA}$ is an $N \times K$ factor loadings matrix; and $\Phi_{AB}$
is a $K \times K$ factor covariance matrix, $A,B = 1,\dots,K$, where $K \ll N$. I.e., the
random processes $\Upsilon_i$ corresponding to $N$ stock returns are modeled via $N$ random
processes $\chi_i$ (specific risk) together with $K$ random processes $f_A$ (factor risk):

$$\Upsilon_i = \chi_i + \sum_{A=1}^K \Omega_{iA}\,f_A \qquad (3)$$

$$\mathrm{Cov}(\chi_i, \chi_j) = \Xi_{ij} \qquad (4)$$

$$\mathrm{Cov}(\chi_i, f_A) = 0 \qquad (5)$$

$$\mathrm{Cov}(f_A, f_B) = \Phi_{AB} \qquad (6)$$

$$\mathrm{Cov}(\Upsilon_i, \Upsilon_j) = \Gamma_{ij} \qquad (7)$$

When $M < N$, where $M + 1$ is the number of observations in each time series, the
sample covariance matrix $C_{ij}$ is singular with $M$ nonzero eigenvalues. In contrast,
assuming all $\xi_i > 0$ and $\Phi_{AB}$ is positive-definite, then $\Gamma_{ij}$ is automatically
positive-definite (and invertible). Furthermore, the off-diagonal elements of $C_{ij}$ typically are
not expected to be too stable out-of-sample. On the contrary, the factor model
covariance matrix $\Gamma_{ij}$ is expected to be much more stable as the number of risk
factors, for which the factor covariance matrix $\Phi_{AB}$ needs to be computed, is $K \ll N$.
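As a minimal sketch of (1)-(2), the following Python snippet assembles $\Gamma_{ij}$ from a specific variance vector, a factor loadings matrix and a factor covariance matrix. The function name `factor_model_cov` and all numerical inputs are made up for illustration and are not taken from the appendices.

```python
def factor_model_cov(xi2, omega, phi):
    """Gamma_ij = xi2_i * delta_ij + sum_{A,B} omega_iA * phi_AB * omega_jB, as in (1)-(2)."""
    n, k = len(omega), len(phi)
    gamma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            factor_part = sum(omega[i][a] * phi[a][b] * omega[j][b]
                              for a in range(k) for b in range(k))
            gamma[i][j] = factor_part + (xi2[i] if i == j else 0.0)
    return gamma

# Illustrative (made-up) inputs: N = 3 stocks, K = 1 factor.
xi2 = [0.04, 0.09, 0.01]          # specific variances xi_i^2 (all positive)
omega = [[1.0], [0.5], [2.0]]     # N x K factor loadings matrix
phi = [[0.02]]                    # K x K factor covariance matrix
gamma = factor_model_cov(xi2, omega, phi)
```

Only the $K(K+1)/2$ independent elements of `phi` and the $N$ specific variances are estimated, as opposed to the $N(N+1)/2$ elements of a sample covariance matrix.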

2.2 Conditions on Total Variances 

The prime aim of a risk model is to predict the covariance matrix out-of-sample as
precisely as possible, including the out-of-sample total variances. However, albeit
this requirement is often overlooked in practical applications, a well-built factor
model had better reproduce the in-sample total variances. That is, we require that
the factor model total variance $\Gamma_{ii}$ coincide with the in-sample total variance $C_{ii}$:

$$\Gamma_{ii} = \xi_i^2 + \sum_{A,B=1}^K \Omega_{iA}\,\Phi_{AB}\,\Omega_{iB} = C_{ii} \qquad (8)$$

A priori this gives $N$ conditions^10 for $N + K(K+1)/2$ unknowns $\xi_i$ and $\Phi_{AB}$, so we
need additional assumptions^11 to compute $\xi_i$ and $\Phi_{AB}$.

^9 (continued) or in any other way. Its sole purpose is to illustrate the algorithms described in
the main text in a simple-to-understand fashion. Some legalese relating to this code is given in
Appendix D.


2.3 Linear Regression 

One such assumption - intuitively - is that the total risk should be attributed to the
factor risk to the greatest extent possible, i.e., the part of the total risk attributed
to the specific risk should be minimized. One way to formulate this requirement
mathematically is via least squares. First, mimicking (3), we decompose the stock
returns $R_i$ via a linear model

$$R_i = \epsilon_i + \sum_{A=1}^K \Omega_{iA}\,f_A \qquad (9)$$

Here the residuals $\epsilon_i$ are not the same as $\chi_i$ in (3); in particular, generally the
covariance matrix $\mathrm{Cov}(\epsilon_i, \epsilon_j)$ is not diagonal (see below). We can require that

$$\sum_{i=1}^N z_i\,\epsilon_i^2 \rightarrow \min \qquad (10)$$

where $z_i > 0$, and the minimization is w.r.t. $f_A$. This produces a weighted linear
regression^12 with the regression weights $z_i$. So, what should these weights be?


2.4 Correlations, Not Covariances 


While choosing unit weights $z_i \equiv 1$ might appear as the simplest thing to do, this
suffers from a shortcoming. Intuitively it is clear that - on average - the residuals $\epsilon_i$
are larger for more volatile stocks, so the regression with unit weights would produce
skewed results.^13 This can be readily rectified using nontrivial regression weights.
A “natural” choice is $z_i = 1/C_{ii}$. In fact, we then have a regression with unit weights:

$$\widetilde{R}_i = \widetilde{\epsilon}_i + \sum_{A=1}^K \widetilde{\Omega}_{iA}\,f_A \qquad (11)$$

$$\sum_{i=1}^N \widetilde{\epsilon}_i^{\,2} \rightarrow \min \qquad (12)$$

where $\widetilde{R}_i \equiv R_i/\sqrt{C_{ii}}$, $\widetilde{\Omega}_{iA} \equiv \Omega_{iA}/\sqrt{C_{ii}}$, and $\widetilde{\epsilon}_i \equiv \epsilon_i/\sqrt{C_{ii}}$ on average are expected
to be much more evenly distributed compared with $\epsilon_i$ - we have scaled away the
volatility skewness via rescaling the returns, factor loadings and residuals by $\sqrt{C_{ii}}$.
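The rescaling can be sketched as follows: dividing each return series by its serial volatility $\sqrt{C_{ii}}$ gives every rescaled series unit variance, so the serial covariance matrix of the rescaled returns is the sample correlation matrix. The helper `serial_cov` and the return series below are hypothetical and serve only as an illustration.

```python
import math

def serial_cov(x, y):
    """Sample (serial) covariance of two equally long time series."""
    m1 = len(x)                                   # M + 1 observations
    mx, my = sum(x) / m1, sum(y) / m1
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (m1 - 1)

# Made-up daily returns for N = 2 stocks over M + 1 = 4 dates.
returns = [
    [0.01, -0.02, 0.03, 0.00],
    [0.10, -0.20, 0.30, 0.05],                    # a much more volatile stock
]
variances = [serial_cov(r, r) for r in returns]   # C_ii
normalized = [[v / math.sqrt(c) for v in r]       # R_i / sqrt(C_ii)
              for r, c in zip(returns, variances)]
# Covariance matrix of the normalized returns = sample correlation matrix.
psi = [[serial_cov(a, b) for b in normalized] for a in normalized]
```

The diagonal of `psi` is 1 regardless of how volatile each stock is, which is what removes the volatility skewness from the regression.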

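^10 With additional assumptions not all of these conditions are nontrivial (see below).

^11 There are no “natural” $K(K+1)/2$ conditions we can impose on $\Gamma_{ij}$, $i \neq j$, in terms of
out-of-sample unstable $C_{ij}$, $i \neq j$. Note that the variances $C_{ii}$ typically are much more stable.

^12 Without the intercept, that is, unless the intercept is already subsumed in $\Omega_{iA}$.

^13 Cross-sectionally, stock volatility typically has a roughly log-normal distribution.

So, we are now modeling the sample correlation matrix $\Psi_{ij} \equiv C_{ij}/\sqrt{C_{ii}}\sqrt{C_{jj}}$
(note that $\Psi_{ii} = 1$, while $|\Psi_{ij}| \leq 1$ for $i \neq j$)^14 via another factor model matrix

$$\widetilde{\Gamma}_{ij} \equiv \widetilde{\xi}_i^{\,2}\,\delta_{ij} + \sum_{A,B=1}^K \widetilde{\Omega}_{iA}\,\Phi_{AB}\,\widetilde{\Omega}_{jB} \qquad (13)$$

where $\widetilde{\Gamma}_{ij} \equiv \Gamma_{ij}/\sqrt{C_{ii}}\sqrt{C_{jj}}$, and $\widetilde{\xi}_i^{\,2} \equiv \xi_i^2/C_{ii}$. The solution to (12) is given by (in
matrix notation)

$$f = \left(\widetilde{\Omega}^T\,\widetilde{\Omega}\right)^{-1} \widetilde{\Omega}^T\,\widetilde{R} \qquad (14)$$

$$\widetilde{\epsilon} = \left[1 - Q\right] \widetilde{R} \qquad (15)$$

$$Q \equiv \widetilde{\Omega} \left(\widetilde{\Omega}^T\,\widetilde{\Omega}\right)^{-1} \widetilde{\Omega}^T \qquad (16)$$

where $Q$ is a projection operator: $Q^2 = Q$. Consequently, we have:

$$\widetilde{\Xi} \equiv \mathrm{Cov}\left(\widetilde{\epsilon}, \widetilde{\epsilon}^{\,T}\right) = \left[1 - Q\right] \Psi \left[1 - Q^T\right] \qquad (17)$$

$$\widetilde{\Omega}\,\Phi\,\widetilde{\Omega}^T \equiv \widetilde{\Omega}\,\mathrm{Cov}\left(f, f^T\right)\,\widetilde{\Omega}^T = Q\,\Psi\,Q^T \qquad (18)$$

Note that the matrix $\widetilde{\Xi}$ is not diagonal. However, the idea here is to identify $\widetilde{\xi}_i^{\,2}$ with
the diagonal part of $\widetilde{\Xi}$:

$$\widetilde{\xi}_i^{\,2} \equiv \widetilde{\Xi}_{ii} = \left(\left[1 - Q\right] \Psi \left[1 - Q^T\right]\right)_{ii} \qquad (19)$$

and we have

$$\widetilde{\Gamma}_{ij} = \widetilde{\xi}_i^{\,2}\,\delta_{ij} + \left(Q\,\Psi\,Q^T\right)_{ij} \qquad (20)$$

Note that $\widetilde{\xi}_i^{\,2}$ defined via (19) are automatically positive (nonnegative, to be precise
- see below). However, we must satisfy the conditions (8), which reduce to

$$\widetilde{\Gamma}_{ii} = \widetilde{\xi}_i^{\,2} + \sum_{A,B=1}^K \widetilde{\Omega}_{iA}\,\Phi_{AB}\,\widetilde{\Omega}_{iB} = \Psi_{ii} = 1 \qquad (21)$$

and imply

$$T_{ii} = 0 \qquad (22)$$

$$T \equiv 2\,Q\,\Psi\,Q^T - Q\,\Psi - \Psi\,Q^T \qquad (23)$$

The $N$ conditions (22) are not all independent. Thus, we have $\mathrm{Tr}(T) = 0$.

^14 If $R_i(t_s)$ ($t_s$ labels the observations in the time series and in the above notations the
index $s$ takes $M + 1$ values) are the time series of the stock returns based on which the sample
covariance matrix $C_{ij}$ is computed (so $C_{ii} = \mathrm{Var}(R_i(t_s))$, where the variance is serial), then
$\Psi_{ij}$ is the sample covariance matrix for the “normalized” returns $\widetilde{R}_i(t_s) \equiv R_i(t_s)/\sqrt{C_{ii}}$, i.e.,
$\Psi_{ij} = \mathrm{Cov}(\widetilde{R}_i(t_s), \widetilde{R}_j(t_s)) = \mathrm{Cor}(R_i(t_s), R_j(t_s))$, where $\mathrm{Cov}(\cdot,\cdot)$ and $\mathrm{Cor}(\cdot,\cdot)$ are serial.

3 Principal Components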

The conditions (22) are nontrivial. They are not satisfied for an arbitrary factor
loadings matrix $\widetilde{\Omega}_{iA}$. However, there is a simple way of satisfying these conditions,
to wit, by building $\widetilde{\Omega}_{iA}$ from the principal components of the correlation matrix $\Psi_{ij}$.

Let $V_i^{(a)}$, $a = 1,\dots,N$, be the $N$ principal components of $\Psi_{ij}$ forming an
orthonormal basis

$$\sum_{j=1}^N \Psi_{ij}\,V_j^{(a)} = \lambda^{(a)}\,V_i^{(a)} \qquad (24)$$

$$\sum_{i=1}^N V_i^{(a)}\,V_i^{(b)} = \delta_{ab} \qquad (25)$$

such that the eigenvalues $\lambda^{(a)}$ are ordered decreasingly: $\lambda^{(1)} > \lambda^{(2)} > \dots$ More
precisely, some eigenvalues may be degenerate. For simplicity - and this is not critical
here - we will assume that all positive eigenvalues are non-degenerate. However, we
can have multiple null eigenvalues. Typically, the number of nonvanishing
eigenvalues^15 is $M$, where, as above, $M + 1$ is the number of observations in the stock return
time series. We can readily construct a factor model with $K < M$:

$$\widetilde{\Omega}_{iA} = \sqrt{\lambda^{(A)}}\;V_i^{(A)} \qquad (26)$$

Then the factor covariance matrix $\Phi_{AB} = \delta_{AB}$ and we have

$$\widetilde{\Gamma}_{ij} = \widetilde{\xi}_i^{\,2}\,\delta_{ij} + \sum_{A=1}^K \lambda^{(A)}\,V_i^{(A)}\,V_j^{(A)} \qquad (27)$$

$$\widetilde{\xi}_i^{\,2} = 1 - \sum_{A=1}^K \lambda^{(A)} \left(V_i^{(A)}\right)^2 \qquad (28)$$

so $\widetilde{\Gamma}_{ii} = \Psi_{ii} = 1$. See Appendix B for the R code including the following algorithm.

3.1 Fixing K

When $K = M$ we have $\widetilde{\Gamma} = \Psi$, which is singular.^16 Therefore, we must have
$K \leq K_{max} < M$. So, how do we determine $K_{max}$? And is there $K_{min}$ other than
the evident answer $K_{min} = 1$? Here we can do a lot of complicated, even convoluted
things. Or we can take a pragmatic approach and come up with a simple heuristic.
Here is one simple algorithm that does a very decent job at fixing $K$.

^15 This number can be smaller if some stock returns are 100% correlated or anti-correlated. For
the sake of simplicity - and this is not critical here - we will assume that there are no such returns.

^16 Note that $\widetilde{\xi}_i^{\,2} = \sum_{a=K+1}^{M} \lambda^{(a)} \left(V_i^{(a)}\right)^2 \geq 0$. We are assuming $\lambda^{(M)} > 0$, which (up to
computational precision) is the case if there are no N/As in the stock return time series.

The idea here is simple. It is based on the observation that, as $K$ approaches
$M$, $\min(\widetilde{\xi}_i^{\,2})$ goes to 0 (i.e., less and less of the total risk is attributed to the specific
risk, and more and more of it is attributed to the factor risk), while as $K$ approaches
0, $\max(\widetilde{\xi}_i^{\,2})$ goes to 1 (i.e., less and less of the total risk is attributed to the factor
risk, and more and more of it is attributed to the specific risk). So, as a rough cut,
we can think of $K_{max}$ and $K_{min}$ as the maximum/minimum values of $K$ such that
$\min(\widetilde{\xi}_i^{\,2}) \geq \zeta_{min}$ and $\max(\widetilde{\xi}_i^{\,2}) \leq \zeta_{max}$, where $\zeta_{min}$ and $\zeta_{max}$ are some desired bounds
on the fraction of the contribution of the specific risk into the total risk. E.g., we
can set $\zeta_{min} = 10\%$ and $\zeta_{max} = 90\%$. In practice, we actually need to fix the value
of $K$, not $K_{max}$ and $K_{min}$, especially since for some preset values of $\zeta_{min}$ and $\zeta_{max}$ we
may end up with $K_{max} < K_{min}$. However, the above discussion aids us in coming
up with a simple heuristic definition for what $K$ should be. Here is one:

$$\left|g(K) - 1\right| \rightarrow \min \qquad (29)$$

$$g(K) \equiv \sqrt{\min(\widetilde{\xi}_i^{\,2})} + \sqrt{\max(\widetilde{\xi}_i^{\,2})} \qquad (30)$$

i.e., we take $K$ for which $g(K)$ (which monotonically decreases with increasing $K$)
is closest to 1. This simple algorithm works pretty well in practical applications.^17
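The heuristic (29)-(30) can be sketched in a few lines of Python, assuming the eigenvalues and principal components of the correlation matrix have already been computed. The function names and the toy spectrum below are assumptions made for illustration; they are not the Appendix code.

```python
import math

def specific_var(eigvals, eigvecs, k):
    """xi_i^2 = 1 - sum_{A<=K} lambda^(A) * (V_i^(A))^2, as in (28)."""
    n = len(eigvecs[0])
    return [1.0 - sum(eigvals[a] * eigvecs[a][i] ** 2 for a in range(k))
            for i in range(n)]

def g(eigvals, eigvecs, k):
    """g(K) = sqrt(min xi^2) + sqrt(max xi^2), as in (30)."""
    xi2 = [max(x, 0.0) for x in specific_var(eigvals, eigvecs, k)]  # guard rounding
    return math.sqrt(min(xi2)) + math.sqrt(max(xi2))

def choose_k(eigvals, eigvecs, k_candidates):
    """Return the K among k_candidates that minimizes |g(K) - 1|, as in (29)."""
    return min(k_candidates, key=lambda k: abs(g(eigvals, eigvecs, k) - 1.0))

# Made-up spectrum of a 4 x 4 correlation matrix with M = 3 nonzero eigenvalues
# (they sum to N = 4). The principal components are normalized Hadamard vectors,
# so the reconstructed matrix has unit diagonal.
eigvals = [2.5, 1.0, 0.5, 0.0]
eigvecs = [[0.5, 0.5, 0.5, 0.5],
           [0.5, -0.5, 0.5, -0.5],
           [0.5, 0.5, -0.5, -0.5],
           [0.5, -0.5, -0.5, 0.5]]
best_k = choose_k(eigvals, eigvecs, [1, 2])   # candidates K = 1, ..., M - 1
```

In this toy example all stocks have the same specific variance for a given $K$, so $g(K)$ is simply twice its square root; with real data the minimum and maximum differ.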

3.2 Limitations 

An evident limitation of the principal component approach is that the number of
risk factors is limited by $M$. If long lookbacks are unavailable/undesirable, as, e.g.,
in short-holding quantitative trading strategies, then typically $M \ll N$. Yet, the
number of the actually relevant underlying risk factors can be substantially greater
than $M$, and most of these risk factors are missed by the principal component
approach. In this regard, we can ask: can we use other than the first $M$ principal
components to build a factor model? The answer, prosaically, is that, without some
additional information, it is unclear what to do with the principal components with
null eigenvalues. They simply do not contribute to any sample factor covariance
matrix. However, not all is lost. There is a way around this difficulty.

4 Heterotic Construction 

4.1 Industry Risk Factors 

Without long lookbacks, the number of risk factors based on principal components
is limited.^18 However, risk factors based on a granular enough industry classification
can be plentiful. Furthermore, they are independent of the pricing data and, in this
regard, are insensitive to the lookback. In fact, typically they tend to be rather
stable out-of-sample as companies seldom jump industries, let alone sectors.

^17 The distribution of $\widetilde{\xi}_i^{\,2}$ is skewed; typically, $\widetilde{\xi}_i^{\,2}$ has a tail at higher values, while $\ln(\widetilde{\xi}_i^{\,2})$ has a
tail at lower values, and the distribution is only roughly log-normal. So $K$ is not (the floor/cap of)
$M/2$, but somewhat higher, albeit close to it. See Table 1 and Figure 1 for an illustrative example.

^18 The number of style factors is also limited (especially for short horizons), of order 10 or fewer.

For terminological definiteness, here we will use the BICS nomenclature for the
levels in the industry classification, albeit this is not critical here. Also, BICS has
three levels “sector → industry → sub-industry” (where “sub-industry” is the most
granular level). The number of levels in the industry hierarchy is not critical here
either. So, we have: $N$ stocks labeled by $i = 1,\dots,N$; $K$ sub-industries labeled
by $A = 1,\dots,K$; $F$ industries labeled by $a = 1,\dots,F$; and $L$ sectors labeled
by $\alpha = 1,\dots,L$. More generally, we can think of such groupings as “clusters”.
Sometimes, loosely, we will refer to such cluster based factors as “industry” factors.^19

4.2 “Binary” Property 

The binary property implies that each stock belongs to one and only one
sub-industry, industry and sector (or, more generally, cluster). Let $G$ be the map between
stocks and sub-industries, $S$ be the map between sub-industries and industries, and
$W$ be the map between industries and sectors:

$$G: \{1,\dots,N\} \mapsto \{1,\dots,K\} \qquad (31)$$

$$S: \{1,\dots,K\} \mapsto \{1,\dots,F\} \qquad (32)$$

$$W: \{1,\dots,F\} \mapsto \{1,\dots,L\} \qquad (33)$$

The nice thing about the binary property is that the clusters (sub-industries,
industries and sectors) can be used to identify blocks (sub-matrices) in the correlation
matrix $\Psi_{ij}$. E.g., at the most granular level, for sub-industries, the binary matrix
$\delta_{G(i),A}$ defines such blocks. Thus, the sum $B_A \equiv \sum_{i=1}^N \delta_{G(i),A}\,X_i$, where $X_i$ is an
arbitrary $N$-vector, is the same as $\sum_{i \in J(A)} X_i$, where $J(A)$ is the set of tickers in
the sub-industry $A$. These blocks are the backbone of the following construction.
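A small sketch of the binary property, assuming a hypothetical map `G` for five tickers and two sub-industries (labeled from 0 in the code): summing with the binary matrix $\delta_{G(i),A}$ gives the same result as summing over the ticker sets $J(A)$.

```python
# Hypothetical map G from N = 5 tickers to K = 2 sub-industries (labels 0 and 1).
G = [0, 1, 0, 1, 1]
X = [1.0, 2.0, 3.0, 4.0, 5.0]      # an arbitrary N-vector
K = 2

# B_A computed via the binary matrix delta_{G(i),A} ...
delta = [[1 if G[i] == a else 0 for a in range(K)] for i in range(len(G))]
B_binary = [sum(delta[i][a] * X[i] for i in range(len(G))) for a in range(K)]

# ... and via the ticker sets J(A) = {i | G(i) = A}.
J = [[i for i in range(len(G)) if G[i] == a] for a in range(K)]
B_sets = [sum(X[i] for i in J[a]) for a in range(K)]
```

Each row of `delta` contains exactly one nonzero entry, which is the statement that every ticker belongs to one and only one sub-industry.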

4.3 Heterotic Models 

Consider the following factor loadings matrix:

$$\widetilde{\Omega}_{iA} = \delta_{G(i),A}\,U_i \qquad (34)$$

$$U_i \equiv [U(A)]_i, \quad i \in J(A), \quad A = 1,\dots,K \qquad (35)$$

where $J(A) \equiv \{i\,|\,G(i) = A\}$ is the set of tickers (whose number $N(A) \equiv |J(A)|$)
in the sub-industry labeled by $A$. Then the $N(A)$-vector $U(A)$ is the first principal
component of the $N(A) \times N(A)$ matrix $\Psi(A)$ defined via $[\Psi(A)]_{ij} \equiv \Psi_{ij}$, $i,j \in J(A)$.
(Note that $\sum_{i \in J(A)} [U(A)]_i^2 = 1$; also, let the corresponding (largest) eigenvalue of
$\Psi(A)$ be $\lambda(A)$.)^20 With this factor loadings matrix we can compute the factor
covariance matrix and specific risk via a linear regression as above, and we get:

$$\widetilde{\xi}_i^{\,2} = 1 - \lambda(G(i))\,U_i^2 \qquad (36)$$

$$\left(\widetilde{\Omega}\,\Phi\,\widetilde{\Omega}^T\right)_{ij} = U_i\,U_j \sum_{k \in J(G(i))} \sum_{l \in J(G(j))} U_k\,\Psi_{kl}\,U_l \qquad (37)$$

so we have^21

$$\widetilde{\Gamma}_{ij} = \left[1 - \lambda(G(i))\,U_i^2\right] \delta_{ij} + U_i\,U_j \sum_{k \in J(G(i))} \sum_{l \in J(G(j))} U_k\,\Psi_{kl}\,U_l \qquad (38)$$

and automatically $\widetilde{\Gamma}_{ii} = 1$. This simplicity is due to the use of the (first) principal
components corresponding to the blocks $\Psi(A)$ of the sample correlation matrix.

^19 Albeit in the BICS context we may be referring to, e.g., sub-industries, while in other
classification schemes the actual naming may be altogether different.
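The following Python sketch illustrates (34)-(38) on a made-up $3 \times 3$ correlation matrix with one two-ticker and one single-ticker sub-industry. It is not the Appendix B code: the first principal component is obtained via plain power iteration (an assumption made to keep the example free of external libraries), and the names `first_pc` and `heterotic_level` are hypothetical.

```python
import math

def first_pc(mat, iters=500):
    """Largest eigenvalue and unit eigenvector of a symmetric positive semi-definite
    matrix via power iteration (assumes the start vector is not orthogonal to it)."""
    n = len(mat)
    u = [1.0 + 0.37 * i for i in range(n)]        # deterministic, non-uniform start
    for _ in range(iters):
        w = [sum(mat[i][j] * u[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        u = [x / norm for x in w]
    lam = sum(u[i] * mat[i][j] * u[j] for i in range(n) for j in range(n))
    return lam, u

def heterotic_level(psi, G, K):
    """Specific variances (36), sample factor covariance matrix and model matrix (38)."""
    n = len(psi)
    U, lam = [0.0] * n, [0.0] * K
    for a in range(K):
        J = [i for i in range(n) if G[i] == a]    # tickers in sub-industry a
        lam[a], u = first_pc([[psi[i][j] for j in J] for i in J])
        for pos, i in enumerate(J):
            U[i] = u[pos]
    xi2 = [1.0 - lam[G[i]] * U[i] ** 2 for i in range(n)]
    phi = [[sum(U[i] * psi[i][j] * U[j]
                for i in range(n) if G[i] == a
                for j in range(n) if G[j] == b)
            for b in range(K)] for a in range(K)]
    gamma = [[U[i] * U[j] * phi[G[i]][G[j]] + (xi2[i] if i == j else 0.0)
              for j in range(n)] for i in range(n)]
    return xi2, phi, gamma

# Made-up correlation matrix: tickers 0 and 1 share a sub-industry, ticker 2 is alone.
psi = [[1.0, 0.6, 0.2],
       [0.6, 1.0, 0.3],
       [0.2, 0.3, 1.0]]
xi2, phi, gamma = heterotic_level(psi, G=[0, 0, 1], K=2)
```

For the two-ticker block the first eigenvalue is $1.6$ with $U_i = 1/\sqrt{2}$, so $\widetilde{\xi}_i^{\,2} = 0.2$, while the single-ticker sub-industry has vanishing specific variance; the diagonal of `gamma` equals 1 in both cases.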

4.3.1 Multiple Principal Components 

For the sake of completeness, let us discuss an evident generalization. Above in
(34) we took the binary map between the tickers and sub-industries and augmented
it with the first principal components of the corresponding blocks in the sample
correlation matrix. Instead of taking only the first principal component, we can take
the first $P(A) \geq 1$ principal components for each block labeled by the sub-industry
$A$ ($A = 1,\dots,K$). Then we have $\widehat{K} = \sum_{A=1}^K P(A)$ factors labeled by pairs
$\widehat{A} \equiv (A, I)$, where for a given value of $A$ we have $I \in D(A)$ (with $|D(A)| = P(A)$).
The factor loadings matrix reads:

$$\widetilde{\Omega}_{i\widehat{A}} = \delta_{G(i),A}\,[U(A)]_i^{(I)} \qquad (39)$$

where $U(A)$ is the $N(A) \times P(A)$ matrix whose columns are the first $P(A)$ principal
components (with eigenvalues $[\lambda(A)]^{(I)}$) of the $N(A) \times N(A)$ matrix $\Psi(A)$ (as above,
$[\Psi(A)]_{ij} \equiv \Psi_{ij}$, $i,j \in J(A)$, and $I \in D(A)$). In
order to have nonvanishing specific risks, it is necessary that we take $P(A) < M$
($M + 1$ is the number of observations in the time series). We then have

$$\widetilde{\Gamma}_{ij} = \left[1 - \sum_{I \in D(G(i))} [\lambda(G(i))]^{(I)} \left([U(G(i))]_i^{(I)}\right)^2\right] \delta_{ij} + \sum_{I \in D(G(i))} \sum_{J \in D(G(j))} [U(G(i))]_i^{(I)}\,[U(G(j))]_j^{(J)} \times \sum_{k \in J(G(i))} \sum_{l \in J(G(j))} [U(G(i))]_k^{(I)}\,\Psi_{kl}\,[U(G(j))]_l^{(J)} \qquad (40)$$

and (as in the case above with all $P(A) \equiv 1$) automatically $\widetilde{\Gamma}_{ii} = 1$.

^20 If $N(A) = 1$, i.e., we have only one ticker in the sub-industry labeled by $A$, then $[U(A)]_i = 1$
and $\lambda(A) = \Psi_{ii} = 1$, $i \in J(A)$.

^21 For single-ticker sub-industries ($N(A) = 1$) the specific risk vanishes: $\widetilde{\xi}_i^{\,2} = 0$; however, this
does not pose a problem as this does not cause the matrix $\widetilde{\Gamma}_{ij}$ to be singular (see below).

4.3.2 Caveats

The above construction might look like a free lunch, but it is not. Let us start
with the $P(A) \equiv 1$ case (first principal components only). For short lookbacks, the
number of risk factors typically is too large: $K$ can easily be greater than $M$, so^22

$$\Phi_{AB} = \sum_{i \in J(A)} \sum_{j \in J(B)} U_i\,\Psi_{ij}\,U_j \qquad (41)$$

is singular. In general, the sample factor covariance matrix is singular if $K > M$. We
will deal with this issue below via the nested Russian-doll risk model construction.

This issue is further exacerbated in the multiple principal component construction
(with at least some $P(A) > 1$) as the number of risk factors $\widehat{K} > K$ is even
larger. This too can be dealt with via the Russian-doll construction. However, there
is yet another caveat pertinent to using multiple principal components, irrespective
of whether the factor covariance matrix is singular or not. The principal components
are based on off-diagonal elements of $\Psi_{ij}$ and tend to be unstable out-of-sample, the
first principal component typically being the most stable. So, for the sake of
simplicity, below we will focus on the case with only first principal components.

5 Russian-Doll Construction 

5.1 General Idea 

As discussed above, the sample factor covariance matrix $\Phi_{AB}$ is singular if the
number of factors $K$ is greater than $M$. The simple idea behind the Russian-doll
construction is to model such $\Phi_{AB}$ itself via yet another factor model matrix $\Gamma'_{AB}$ (as
opposed to computing it as a sample covariance matrix of the risk factors $f_A$):^23

$$\Gamma'_{AB} = \sqrt{\Phi_{AA}}\,\sqrt{\Phi_{BB}}\;\widetilde{\Gamma}'_{AB} \qquad (42)$$

$$\widetilde{\Gamma}'_{AB} = (\widetilde{\xi}'_A)^2\,\delta_{AB} + \sum_{a,b=1}^F \widetilde{\Omega}'_{Aa}\,\Phi'_{ab}\,\widetilde{\Omega}'_{Bb} \qquad (43)$$

where $\widetilde{\xi}'_A$ is the specific risk for the “normalized” factor return $\widetilde{f}_A \equiv f_A/\sqrt{\Phi_{AA}}$;
$\widetilde{\Omega}'_{Aa}$, $A = 1,\dots,K$, $a = 1,\dots,F$, is the corresponding factor loadings matrix; and
$\Phi'_{ab}$ is the factor covariance matrix for the underlying risk factors $f'_a$, $a = 1,\dots,F$,
where we assume that $F \ll K$. If the smaller factor covariance matrix $\Phi'_{ab}$ is still
singular, we model it via yet another factor model with fewer risk factors, and so on
- until the resulting factor covariance matrix is nonsingular. If, at the final stage,
we are left with a single factor, then the resulting $1 \times 1$ factor covariance matrix is
automatically nonsingular - it is simply the sample variance of the remaining factor.

^22 Note that $\Phi_{AA} = \lambda(A)$.

^23 We use primes on the quantities of the nested factor model (e.g., $\Gamma'_{AB}$) to avoid confusion
with their unprimed counterparts.

5.2 Complete Heterotic Russian-Doll Embedding

For concreteness we will use the BICS terminology for the levels in the industry
classification, albeit this is not critical here. Also, BICS has three levels “sector →
industry → sub-industry” (where “sub-industry” is the most granular level). For
definiteness, we will assume three levels here, and the generalization to more levels
is straightforward. So, we have: $N$ stocks labeled by $i = 1,\dots,N$; $K$ sub-industries
labeled by $A = 1,\dots,K$; $F$ industries labeled by $a = 1,\dots,F$; and $L$ sectors labeled
by $\alpha = 1,\dots,L$. A nested Russian-doll risk model then is constructed as follows:


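$$\Gamma_{ij} = \sqrt{C_{ii}}\,\sqrt{C_{jj}}\;\widetilde{\Gamma}_{ij} \qquad (44)$$

$$\widetilde{\Gamma}_{ij} = \widetilde{\xi}_i^{\,2}\,\delta_{ij} + U_i\,U_j\,\Gamma'_{G(i),G(j)} \qquad (45)$$

$$\Gamma'_{AB} = \sqrt{\Phi_{AA}}\,\sqrt{\Phi_{BB}}\;\widetilde{\Gamma}'_{AB} \qquad (46)$$

$$\widetilde{\Gamma}'_{AB} = (\widetilde{\xi}'_A)^2\,\delta_{AB} + U'_A\,U'_B\,\Gamma''_{S(A),S(B)} \qquad (47)$$

$$\Gamma''_{ab} = \sqrt{\Phi'_{aa}}\,\sqrt{\Phi'_{bb}}\;\widetilde{\Gamma}''_{ab} \qquad (48)$$

$$\widetilde{\Gamma}''_{ab} = (\widetilde{\xi}''_a)^2\,\delta_{ab} + U''_a\,U''_b\,\Gamma'''_{W(a),W(b)} \qquad (49)$$

$$\Gamma'''_{\alpha\beta} = \sqrt{\Phi''_{\alpha\alpha}}\,\sqrt{\Phi''_{\beta\beta}}\;\widetilde{\Gamma}'''_{\alpha\beta} \qquad (50)$$

$$\widetilde{\Gamma}'''_{\alpha\beta} = (\widetilde{\xi}'''_\alpha)^2\,\delta_{\alpha\beta} + U'''_\alpha\,\Phi'''\,U'''_\beta \qquad (51)$$

where

$$\widetilde{\xi}_i^{\,2} = 1 - \lambda(G(i))\,U_i^2 \qquad (52)$$

$$(\widetilde{\xi}'_A)^2 = 1 - \lambda'(S(A))\,(U'_A)^2 \qquad (53)$$

$$(\widetilde{\xi}''_a)^2 = 1 - \lambda''(W(a))\,(U''_a)^2 \qquad (54)$$

$$(\widetilde{\xi}'''_\alpha)^2 = 1 - \lambda'''\,(U'''_\alpha)^2 \qquad (55)$$

$$\Phi_{AB} = \sum_{i \in J(A)} \sum_{j \in J(B)} U_i\,\Psi_{ij}\,U_j \qquad (56)$$

$$\Phi'_{ab} = \sum_{A \in J'(a)} \sum_{B \in J'(b)} U'_A\,\Psi'_{AB}\,U'_B \qquad (57)$$

$$\Phi''_{\alpha\beta} = \sum_{a \in J''(\alpha)} \sum_{b \in J''(\beta)} U''_a\,\Psi''_{ab}\,U''_b \qquad (58)$$

$$\Phi''' = \sum_{\alpha,\beta=1}^L U'''_\alpha\,\Psi'''_{\alpha\beta}\,U'''_\beta \qquad (59)$$

and $\Phi_{AA} = \lambda(A)$, $\Phi'_{aa} = \lambda'(a)$, $\Phi''_{\alpha\alpha} = \lambda''(\alpha)$, and $\Phi''' = \lambda'''$ (see below).

Also, $J(A) \equiv \{i\,|\,G(i) = A\}$ ($N(A) = |J(A)|$ tickers in sub-industry $A$), $J'(a) \equiv
\{A\,|\,S(A) = a\}$ ($N'(a) = |J'(a)|$ sub-industries in industry $a$), $J''(\alpha) \equiv \{a\,|\,W(a) = \alpha\}$
($N''(\alpha) = |J''(\alpha)|$ industries in sector $\alpha$), and the maps $G$ (tickers to sub-industries),
$S$ (sub-industries to industries) and $W$ (industries to sectors) are defined in (31),
(32) and (33). Furthermore,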


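$$\Psi_{ij} \equiv C_{ij}/\sqrt{C_{ii}}\,\sqrt{C_{jj}} \qquad (60)$$

$$\Psi'_{AB} \equiv \Phi_{AB}/\sqrt{\Phi_{AA}}\,\sqrt{\Phi_{BB}} \qquad (61)$$

$$\Psi''_{ab} \equiv \Phi'_{ab}/\sqrt{\Phi'_{aa}}\,\sqrt{\Phi'_{bb}} \qquad (62)$$

$$\Psi'''_{\alpha\beta} \equiv \Phi''_{\alpha\beta}/\sqrt{\Phi''_{\alpha\alpha}}\,\sqrt{\Phi''_{\beta\beta}} \qquad (63)$$

$$U_i \equiv [U(A)]_i, \quad i \in J(A) \qquad (64)$$

$$U'_A \equiv [U'(a)]_A, \quad A \in J'(a) \qquad (65)$$

$$U''_a \equiv [U''(\alpha)]_a, \quad a \in J''(\alpha) \qquad (66)$$

The $N(A)$-vector $U(A)$ is the first principal component of $\Psi(A)$ with the eigenvalue
$\lambda(A)$ ($[\Psi(A)]_{ij} \equiv \Psi_{ij}$, $i,j \in J(A)$), the $N'(a)$-vector $U'(a)$ is the first principal
component of $\Psi'(a)$ with the eigenvalue $\lambda'(a)$ ($[\Psi'(a)]_{AB} \equiv \Psi'_{AB}$, $A,B \in J'(a)$), the
$N''(\alpha)$-vector $U''(\alpha)$ is the first principal component of $\Psi''(\alpha)$ with the eigenvalue
$\lambda''(\alpha)$ ($[\Psi''(\alpha)]_{ab} \equiv \Psi''_{ab}$, $a,b \in J''(\alpha)$), while $U'''$ is the first principal component of
$\Psi'''_{\alpha\beta}$ with the eigenvalue $\lambda'''$. The vectors $U(A)$, $U'(a)$ and $U''(\alpha)$ are normalized, so
$\sum_{i \in J(A)} U_i^2 = 1$, $\sum_{A \in J'(a)} (U'_A)^2 = 1$, $\sum_{a \in J''(\alpha)} (U''_a)^2 = 1$, and also $\sum_{\alpha=1}^L (U'''_\alpha)^2 = 1$.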

For the sake of completeness, above we included the step where the sample factor
covariance matrix $\Phi''_{\alpha\beta}$ for the sectors is further approximated via a 1-factor model
$\Gamma'''_{\alpha\beta}$. If $\Phi''_{\alpha\beta}$ computed via (58) is nonsingular,^24 then this last step can be omitted,
so at the last stage we have $L$ factors (as opposed to a single factor).^25 Similarly, if
we have enough observations to compute the sample covariance matrix $\Phi'_{ab}$ for the
industries, we can stop at that stage. Finally, note that in the above construction
we are guaranteed to have $(\widetilde{\xi}'''_\alpha)^2 > 0$; $(\widetilde{\xi}''_a)^2 > 0$; $(\widetilde{\xi}'_A)^2 > 0$; and $\widetilde{\xi}_i^{\,2} \geq 0$ (with the last
equality occurring only for single-ticker sub-industries and not posing a problem -
see below).^26 In Appendix B we give the R code for building heterotic risk models.
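The nested structure of (44)-(59) lends itself to a short recursive sketch, given here in Python under simplifying assumptions: first principal components are computed via power iteration, cluster labels start at 0, and once no further cluster map is supplied the sample matrix at that level is used as is. The function names and the $4 \times 4$ correlation matrix are made up; this is not the Appendix B code.

```python
import math

def first_pc(mat, iters=500):
    """Largest eigenvalue and unit eigenvector of a symmetric PSD matrix (power iteration)."""
    n = len(mat)
    u = [1.0 + 0.37 * i for i in range(n)]
    for _ in range(iters):
        w = [sum(mat[i][j] * u[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        u = [x / norm for x in w]
    return sum(u[i] * mat[i][j] * u[j] for i in range(n) for j in range(n)), u

def normalize(cov):
    """Turn a covariance matrix into a correlation matrix, as in (61)-(63)."""
    d = [math.sqrt(cov[i][i]) for i in range(len(cov))]
    return [[cov[i][j] / (d[i] * d[j]) for j in range(len(cov))] for i in range(len(cov))]

def russian_doll(psi, maps):
    """Nested model for the correlation matrix psi. maps[0] sends items to clusters,
    maps[1] sends those clusters to coarser clusters, and so on."""
    if not maps:
        return psi                                     # final stage: sample matrix
    G, n = maps[0], len(psi)
    K = max(G) + 1
    U, lam = [0.0] * n, [0.0] * K
    for a in range(K):
        J = [i for i in range(n) if G[i] == a]
        lam[a], u = first_pc([[psi[i][j] for j in J] for i in J])
        for pos, i in enumerate(J):
            U[i] = u[pos]
    phi = [[sum(U[i] * psi[i][j] * U[j] for i in range(n) if G[i] == a
                for j in range(n) if G[j] == b) for b in range(K)] for a in range(K)]
    inner = russian_doll(normalize(phi), maps[1:])     # nested model, one level up
    gamma_p = [[math.sqrt(phi[a][a] * phi[b][b]) * inner[a][b] for b in range(K)]
               for a in range(K)]                      # as in (46)
    return [[U[i] * U[j] * gamma_p[G[i]][G[j]]
             + ((1.0 - lam[G[i]] * U[i] ** 2) if i == j else 0.0)
             for j in range(n)] for i in range(n)]     # as in (45)

# Made-up example: 4 tickers, 3 sub-industries, and a single "market" cluster on top.
psi = [[1.0, 0.5, 0.2, 0.1],
       [0.5, 1.0, 0.3, 0.1],
       [0.2, 0.3, 1.0, 0.2],
       [0.1, 0.1, 0.2, 1.0]]
gamma = russian_doll(psi, [[0, 0, 1, 2], [0, 0, 0]])
```

Mapping all clusters to a single one at the last level reproduces the 1-factor step (51); multiplying `gamma` by $\sqrt{C_{ii}}\sqrt{C_{jj}}$ as in (44) would give the model covariance matrix.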


^24 That is, assuming there are enough observations in the time series for out-of-sample stability.

^25 This last factor can be interpreted as the “market” risk factor. For the sake of
completeness, the definitions of the factors at each stage are as follows: (i) for the sub-industries
$f_A = \sum_{i \in J(A)} U_i\,\widetilde{R}_i$, where $\widetilde{R}_i \equiv R_i/\sqrt{C_{ii}}$; (ii) for the industries $f'_a = \sum_{A \in J'(a)} U'_A\,\widetilde{f}_A$, where
$\widetilde{f}_A \equiv f_A/\sqrt{\Phi_{AA}}$; (iii) for the sectors $f''_\alpha = \sum_{a \in J''(\alpha)} U''_a\,\widetilde{f}'_a$, where $\widetilde{f}'_a \equiv f'_a/\sqrt{\Phi'_{aa}}$; and (iv) for the
“market” $f''' = \sum_{\alpha=1}^L U'''_\alpha\,\widetilde{f}''_\alpha$, where $\widetilde{f}''_\alpha \equiv f''_\alpha/\sqrt{\Phi''_{\alpha\alpha}}$.

^26 For a typical, large trading universe, industries and sectors usually contain more than one
ticker; however, there can be cases of single-ticker sub-industries. Nonetheless, $\widetilde{\Gamma}_{ij}$ is nonsingular.
Indeed, for an arbitrary $N$-vector $X_i$ we have $X^T\,\widetilde{\Gamma}\,X > 0$ unless $X_i = 0$, $i \notin H$, where $H \equiv
\{i\,|\,N(G(i)) = 1\}$. For such $X_i$ we have $X^T\,\widetilde{\Gamma}\,X = \sum_{A,B \in E} Y_A\,\Gamma'_{AB}\,Y_B > 0$, where $E \equiv \{A\,|\,N(A) =
1\}$, $Y_A \equiv X_i$, $i \in J(A)$, $A \in E$, and we have taken into account that by construction $\Gamma'_{AB}$ (and its
sub-matrix with $A,B \in E$) is positive-definite, and also that $U_i = 1$, $i \in H$. More on this below.

5.3 Model Covariance Matrix and Its Inverse 


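The model covariance matrix is given by $\Gamma_{ij}$ defined in (44). For completeness, let
us present it in the “canonical” form:

$$\Gamma_{ij} = \xi_i^2\,\delta_{ij} + \sum_{A,B=1}^K \Omega_{iA}\,\Phi^*_{AB}\,\Omega_{jB} \qquad (67)$$

where

$$\xi_i^2 \equiv C_{ii}\,\widetilde{\xi}_i^{\,2} \qquad (68)$$

$$\Omega_{iA} \equiv \sqrt{C_{ii}}\;U_i\,\delta_{G(i),A} \qquad (69)$$

$$\Phi^*_{AB} \equiv \Gamma'_{AB} \qquad (70)$$

where $\widetilde{\xi}_i^{\,2}$ is defined in (52), $U_i$ is defined in (64), $\Gamma'_{AB}$ is defined in (46), and we use
the star superscript in our factor covariance matrix $\Phi^*_{AB}$ (which is nonsingular)
to distinguish it from the sample factor covariance matrix $\Phi_{AB}$ (which is singular).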

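In many applications, such as portfolio optimization, one needs the inverse of
the matrix $\Gamma$. When we have no single-ticker sub-industries, the inverse is given by
(in matrix notation)

$$\Gamma^{-1} = \Xi^{-1} - \Xi^{-1}\,\Omega\,\Delta^{-1}\,\Omega^T\,\Xi^{-1} \qquad (71)$$

$$\Delta \equiv (\Phi^*)^{-1} + \Omega^T\,\Xi^{-1}\,\Omega \qquad (72)$$

$$\Xi \equiv \mathrm{diag}(\xi_i^2) \qquad (73)$$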
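A minimal Python sketch of (71)-(73) follows. It assumes all specific variances are positive and uses a small Gauss-Jordan routine for the $K \times K$ inversions; the helper names and the numerical inputs are made up for illustration.

```python
def inv(a):
    """Inverse of a small square matrix via Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [x / piv for x in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def factor_model_inverse(xi2, omega, phi_star):
    """Gamma^{-1} as in (71)-(73); only K x K matrices are inverted. Requires xi2_i > 0."""
    n, k = len(omega), len(phi_star)
    phi_inv = inv(phi_star)
    delta = [[phi_inv[a][b] + sum(omega[i][a] * omega[i][b] / xi2[i] for i in range(n))
              for b in range(k)] for a in range(k)]
    delta_inv = inv(delta)
    return [[(1.0 / xi2[i] if i == j else 0.0)
             - sum(omega[i][a] * delta_inv[a][b] * omega[j][b]
                   for a in range(k) for b in range(k)) / (xi2[i] * xi2[j])
             for j in range(n)] for i in range(n)]

# Made-up inputs: N = 3 stocks, K = 2 factors.
xi2 = [0.04, 0.09, 0.01]
omega = [[0.2, 0.0], [0.3, 0.0], [0.0, 0.1]]
phi_star = [[1.6, 0.35], [0.35, 1.0]]
gamma = [[sum(omega[i][a] * phi_star[a][b] * omega[j][b]
              for a in range(2) for b in range(2)) + (xi2[i] if i == j else 0.0)
          for j in range(3)] for i in range(3)]
gamma_inv = factor_model_inverse(xi2, omega, phi_star)
```

Multiplying `gamma` by `gamma_inv` should reproduce the identity matrix up to rounding, which is a convenient sanity check when $N$ is large and direct inversion is expensive.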


However, when there are some single-ticker sub-industries, the corresponding = 0, 
i ^ H {H = {i|iV(G(i)) = 1}), so (71) “breaks”. Happily, there is an easy “fix”. 
This is because for such tickers the specific risk and factor risk are indistinguishable. 
Recall that Ui = 1, i G H, and = 1, A ^ E (E = {H|iV(H) = 1}). We can 
rewrite Fj^ via 

K 

= 3 h + E ('^4) 

A,B=1 

where: for i ^ H] = Cu Q for i G 77 with arbitrary Q, 0 < Q < 1] 

^AB — ^*AB A A ^ E OT B ^ E or A ^ B; and ^*aa — ^A,G{i) 0 fon A E E. (Here 
we have taken into account that Ui = 1, i E H.) Now we can invert F via 

F-i ^ g-i _ e-i ^ ^-1 g-i (75) 

A = ($*)-^ + §-1 (76) 

S = diag(g) (77) 


Note that, due to the factor model structure, to invert the N x N matrix F, we 
only need to invert two K x K matrices <F* and A. If there are no single-ticker 
sub-industries, then itself has a factor model structure and involves inverting 
two E X E matrices, one of which has a factor model structure, and so on. 
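The factor-model inversion above can be checked numerically. The following is a minimal R sketch under stated assumptions: the inputs are randomly generated, and the names N, K, omega, phi and xi2 are hypothetical stand-ins for the loadings, the (positive-definite) factor covariance matrix and the (strictly positive) specific variances; they are not part of the source code in the appendices.

```r
set.seed(1)
N <- 50   # number of stocks (illustrative)
K <- 5    # number of factors (illustrative)

# Hypothetical inputs: loadings, a positive-definite factor covariance
# matrix, and strictly positive specific variances
omega <- matrix(rnorm(N * K), N, K)
a <- matrix(rnorm(K * K), K, K)
phi <- crossprod(a) + diag(K)
xi2 <- runif(N, 0.5, 1.5)

# Factor model covariance matrix in the "canonical" form
gamma <- diag(xi2) + omega %*% phi %*% t(omega)

# Inverse built from two K x K inversions only
v <- omega / xi2                       # row i of omega divided by xi2[i]
delta <- solve(phi) + t(omega) %*% v   # K x K matrix Delta
gamma.inv <- diag(1 / xi2) - v %*% solve(delta) %*% t(v)

# Compare with a brute-force N x N inversion
err <- max(abs(gamma.inv - solve(gamma)))
print(err)
```

For small N the direct solve() is just as convenient; the point of the factor form is that only K x K matrices are inverted, which matters when N is in the thousands.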
6 Horse Race


So, suppose we have built a complete heterotic risk model. How do we know it adds value? I.e., how do we know that the off-diagonal elements of the factor model covariance matrix $\Gamma_{ij}$ are stable out-of-sample to the extent that they add value? We can run a horse race. There are many ways of doing this. Here is one. For a given trading universe we compute some expected returns, e.g., based on overnight mean-reversion. We can construct a trading portfolio by using our heterotic risk model covariance matrix in the optimization whereby we maximize the Sharpe ratio (subject to the dollar neutrality constraint). On the other hand, we can run the same optimization with a diagonal sample covariance matrix $\mathrm{diag}(C_{ii})$ subject to neutrality (via linear homogeneous constraints) w.r.t. the underlying heterotic risk factors (plus dollar neutrality).^27 In fact, optimization with such a diagonal covariance matrix and subject to such linear homogeneous constraints is equivalent to a weighted cross-sectional regression with the loadings matrix (over which the expected returns are regressed) identified with the factor loadings matrix (augmented by the intercept, i.e., the unit vector, for dollar neutrality) and the regression weights identified with the inverse sample variances $1/C_{ii}$ (see (Kakushadze, 2015a) for details). So, we will refer to the horse race as between optimization (using the heterotic risk model) and weighted regression (with the aforementioned linear homogeneous constraints).^28

6.1 Notations 

Let $P_{is}$ be the time series of stock prices, where $i = 1, \dots, N$ labels the stocks, and $s = 0, 1, \dots, M$ labels the trading dates, with $s = 0$ corresponding to the most recent date in the time series. The superscripts $O$ and $C$ (unadjusted open and close prices) and $AO$ and $AC$ (open and close prices fully adjusted for splits and dividends) will distinguish the corresponding prices, so, e.g., $P^C_{is}$ is the unadjusted close price. $V_{is}$ is the unadjusted daily volume (in shares). Also, for each date $s$ we define the overnight return as the previous-close-to-open return:

$$E_{is} = \ln\left(P^{AO}_{is} / P^{AC}_{i,s+1}\right) \tag{78}$$

This return will be used in the definition of the expected return in our mean-reversion alpha. We will also need the close-to-close return

$$R_{is} = \ln\left(P^{AC}_{is} / P^{AC}_{i,s+1}\right) \tag{79}$$

An out-of-sample (see below) time series of these returns will be used in constructing the heterotic risk model and computing, among other things, the sample variances $C_{ii}$. Also note that all prices in the definitions of $E_{is}$ and $R_{is}$ are fully adjusted.
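As an illustration of the definitions (78) and (79), here is a minimal R sketch. It assumes (hypothetically) that the fully adjusted open and close prices are stored in N x (M + 1) matrices p.ao and p.ac whose first column is the most recent date (s = 0); the prices themselves are randomly generated placeholders.

```r
set.seed(2)
N <- 4   # number of stocks (illustrative)
M <- 6   # dates are s = 0, ..., M; column s + 1 holds date s

# Hypothetical fully adjusted close and open prices
p.ac <- matrix(runif(N * (M + 1), 10, 20), N, M + 1)
p.ao <- p.ac * exp(matrix(rnorm(N * (M + 1), 0, 0.01), N, M + 1))

# Columns for which the previous close (the next column) is available
s <- 1:M

# Overnight (previous-close-to-open) returns, as in (78)
ovn.ret <- log(p.ao[, s] / p.ac[, s + 1])

# Close-to-close returns, as in (79)
c2c.ret <- log(p.ac[, s] / p.ac[, s + 1])

# The difference of the two is the open-to-close log-return
intraday <- c2c.ret - ovn.ret
```

With the most recent date in the first column, "previous date" simply means "next column", which is why both returns divide by column s + 1.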

^27 For comparative purposes, we will also run separate backtests where we require neutrality w.r.t. BICS sub-industries and principal components.

^28 The remainder of this section somewhat overlaps with Section 7 of (Kakushadze, 2015a) as the backtesting models are similar, albeit not identical.
We assume that: i) the portfolio is established at the open^29 with fills at the open prices $P^O_{is}$; ii) it is liquidated at the close on the same day - so this is a purely intraday alpha - with fills at the close prices $P^C_{is}$; and iii) there are no transaction costs or slippage - our aim here is not to build a realistic trading strategy, but to test that our heterotic risk model adds value to the alpha. The P&L for each stock is

$$\Pi_{is} = H_{is} \left[\frac{P^C_{is}}{P^O_{is}} - 1\right] \tag{80}$$

where $H_{is}$ are the dollar holdings. The shares bought plus sold (establishing plus liquidating trades) for each stock on each day are computed via $Q_{is} = 2 |H_{is}| / P^O_{is}$.


6.2 Universe Selection 

For the sake of simplicity,^30 we select our universe based on the average daily dollar volume (ADDV) defined via (note that $A_{is}$ is out-of-sample for each date $s$):

$$A_{is} = \frac{1}{d} \sum_{r=1}^{d} V_{i,s+r}\, P^C_{i,s+r} \tag{81}$$

We take $d = 21$ (i.e., one month), and then take our universe to be the top 2000 tickers by ADDV. To ensure that we do not inadvertently introduce a universe selection bias, we rebalance monthly (every 21 trading days, to be precise). I.e., we break our 5-year backtest period (see below) into 21-day intervals, we compute the universe using ADDV (which, in turn, is computed based on the 21-day period immediately preceding such interval), and use this universe during the entire such interval. We do have the survivorship bias as we take the data for the universe of tickers as of 9/6/2014 that have historical pricing data on http://finance.yahoo.com (accessed on 9/6/2014) for the period 8/1/2008 through 9/5/2014. We restrict this universe to include only U.S. listed common stocks and class shares (no OTCs, preferred shares, etc.) with BICS sector, industry and sub-industry assignments as of 9/6/2014.^31 However, as discussed in detail in Section 7 of (Kakushadze, 2015a), the survivorship bias is not a leading effect in such backtests.
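The universe selection above can be sketched in R as follows. This is only an illustration under stated assumptions: p.close and vol are hypothetical N x D matrices of unadjusted close prices and share volumes (randomly generated here) with the most recent date in the first column, calc.addv is a hypothetical helper name, and a toy universe of the top 3 (rather than 2000) tickers is used.

```r
set.seed(4)
N <- 10       # number of tickers (illustrative)
D <- 63       # number of dates; column 1 is the most recent date
d <- 21       # ADDV lookback
n.top <- 3    # size of the toy universe (2000 in the text)

# Hypothetical unadjusted close prices and daily share volumes
p.close <- matrix(runif(N * D, 5, 50), N, D)
vol <- matrix(round(runif(N * D, 1e4, 1e6)), N, D)

# ADDV for the date stored in column `col`: the average of volume times
# close over the d columns that follow it (i.e., the d preceding dates)
calc.addv <- function(col)
   rowMeans(vol[, col + 1:d] * p.close[, col + 1:d])

# Universe for the most recent 21-day interval (columns 1 to d): its
# earliest date sits in column d, so ADDV uses columns d + 1 to 2 * d
addv <- calc.addv(d)
universe <- order(addv, decreasing = TRUE)[1:n.top]
print(universe)
```

Because ADDV uses only the dates strictly before the interval, the same universe can be held fixed for all 21 days of the interval without look-ahead.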


6.3 Backtesting 

We run our simulations over a period of 5 years (more precisely, 1260 trading days going back from 9/5/2014, inclusive). The annualized return-on-capital (ROC) is computed as the average daily P&L divided by the intraday investment level $I$ (with no leverage) and multiplied by 252. The annualized Sharpe Ratio (SR) is computed as the daily Sharpe ratio multiplied by $\sqrt{252}$. Cents-per-share (CPS) is computed as the total P&L divided by the total shares traded.

^29 This is a so-called “delay-0” alpha: the same price, $P^O_{is}$ (or adjusted $P^{AO}_{is}$), is used in computing the expected return (via $E_{is}$) and as the establishing fill price.

^30 In practical applications, the trading universe of liquid stocks typically is selected based on market cap, liquidity (ADDV), price and other (proprietary) criteria.

^31 The choice of the backtesting window is based on what data was readily available.

^32 Here we are after the relative outperformance, and it is reasonable to assume that, to the leading order, individual performances are affected by the survivorship bias approximately equally.
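The performance measures ROC, SR and CPS defined in this subsection, together with the P&L (80) and the traded shares, can be sketched in R as follows. All inputs are randomly generated placeholders; the names p.open, p.close, h and inv.level are hypothetical, and the factor of 100 converting dollars per share into cents per share is our assumption about units.

```r
set.seed(3)
N <- 100           # number of stocks (illustrative)
D <- 252           # number of trading days (illustrative)
inv.level <- 1e6   # intraday investment level I (long plus short)

# Hypothetical unadjusted open and close prices
p.open <- matrix(runif(N * D, 10, 100), N, D)
p.close <- p.open * exp(matrix(rnorm(N * D, 0, 0.01), N, D))

# Hypothetical dollar holdings: dollar neutral, with sum |H| = I each day
h <- matrix(rnorm(N * D), N, D)
h <- sweep(h, 2, colMeans(h))
h <- inv.level * sweep(h, 2, colSums(abs(h)), "/")

pnl <- h * (p.close / p.open - 1)   # P&L per stock and day, as in (80)
shares <- 2 * abs(h) / p.open       # shares bought plus sold
daily.pnl <- colSums(pnl)

roc <- 252 * mean(daily.pnl) / inv.level            # annualized ROC
sr <- sqrt(252) * mean(daily.pnl) / sd(daily.pnl)   # annualized Sharpe ratio
cps <- 100 * sum(pnl) / sum(shares)                 # cents per share

print(c(roc = roc, sr = sr, cps = cps))
```

With random holdings the three numbers are pure noise; the sketch only fixes the bookkeeping.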

6.4 Weighted Regression Alphas 

We will always require that our portfolio be dollar neutral:

$$\sum_{i=1}^{N} H_{is} = 0 \tag{82}$$

We will further require neutrality

$$\sum_{i=1}^{N} H_{is}\, \Lambda_{iAs} = 0 \tag{83}$$

with the three different incarnations for the loadings matrix $\Lambda_{iA}$ (for each trading day $s$, so we omit the index $s$)^34 defined via:

$$\text{principal components:} \quad \Lambda_{iA} = \sqrt{\lambda^{(A)}}\; V^{(A)}_i, \quad A = 1, \dots, K_{PC} \tag{84}$$

$$\text{sub-industries:} \quad \Lambda_{iA} = \delta_{G(i),A}, \quad A = 1, \dots, K \tag{85}$$

$$\text{heterotic risk factors:} \quad \Lambda_{iA} = U_i\, \delta_{G(i),A}, \quad A = 1, \dots, K \tag{86}$$

Here $V^{(A)}_i$ are the first $K_{PC}$ principal components (with the eigenvalues $\lambda^{(A)}$) of the sample correlation matrix $\Psi_{ij}$. For each date $s$ we take $M + 1 = d = 21$ trading days as our lookback (i.e., the number of observations) in the out-of-sample time series of close-to-close (see (79)) returns $(R_{i,(s+1)}, R_{i,(s+2)}, \dots, R_{i,(s+d)})$ (based on which we compute the sample covariance (correlation) matrix $C_{ijs}$ ($\Psi_{ijs}$) for each $s$), so the number of the nonvanishing eigenvalues $\lambda^{(A)} > 0$ is $M = 20$, and we take $K_{PC} = M$. Further, the map $G$ between tickers and sub-industries is defined in (31), and $K$ is the number of sub-industries.^36 Finally, the vector $U_i$ in (86) is defined in (64).
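The eigenvalue count quoted above can be verified with a short R sketch. It assumes randomly generated returns and a hypothetical ticker-to-cluster map g; the names ret, psi, lambda.pc and lambda.ind are ours. The sketch builds the principal component loadings in the form (84) and binary loadings in the form (85).

```r
set.seed(5)
N <- 50   # number of stocks (illustrative)
d <- 21   # number of observations in the lookback

# Hypothetical N x d matrix of close-to-close returns
ret <- matrix(rnorm(N * d, 0, 0.02), N, d)

# N x N sample correlation matrix and its spectrum
psi <- cor(t(ret))
e <- eigen(psi, symmetric = TRUE)

# Number of nonvanishing eigenvalues (up to numerical noise)
n.pos <- sum(e$values > 1e-10)
print(n.pos)

# Principal component loadings with K_PC = d - 1, as in (84)
k.pc <- d - 1
lambda.pc <- t(sqrt(e$values[1:k.pc]) * t(e$vectors[, 1:k.pc]))

# With all nonvanishing eigenvalues kept, the loadings reproduce psi
err <- max(abs(lambda.pc %*% t(lambda.pc) - psi))

# Binary loadings as in (85) for a hypothetical map g (5 clusters)
g <- rep(1:5, each = 10)
lambda.ind <- outer(g, 1:5, "==") * 1
```

Each row of lambda.ind sums to 1, which is why the intercept is subsumed in the binary loadings and dollar neutrality is automatic in that case.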

For each date labeled by $s$, we run cross-sectional regressions of the overnight (see (78)) returns $E_{is}$ over the corresponding loadings matrix, call it $Y$ (with indices suppressed), which has 3 different incarnations: i) for principal components, $Y$ is an $N \times (K_{PC} + 1)$ matrix, whose first column is the intercept (unit $N$-vector), and the remaining columns are populated by $\Lambda_{iA}$ defined in (84); ii) for sub-industries, the elements of $Y$ are the same as $\Lambda_{iA}$ defined in (85); and iii) for heterotic risk factors, $Y$ is an $N \times (K + 1)$ matrix, whose first column is the intercept (unit $N$-vector), and the remaining columns are populated by $\Lambda_{iA}$ defined in (86). We take the regression weights to be $z_i = 1/C_{ii}$. More precisely, to avoid unnecessary variations in the weights $z_i$ (as such variations could result in unnecessary overtrading), we do not recompute $z_i$ daily but every 21 trading days, same as with the trading universe.

^33 As mentioned above, we assume no transaction costs, which are expected to reduce the ROC of the optimization and weighted regression alphas by the same amount as the two strategies trade the exact same amount by design. Therefore, including the transaction costs would have no effect on the actual relative outperformance in the horse race, which is what we are after here.

^34 The loadings in (84) and (86) are computed for each trading date $s$ (as opposed to, say, every 21 days - see below); in (85) they change only with the universe (every 21 days).

^35 The factor $\sqrt{\lambda^{(A)}}$ in (84) does not affect the regression residuals below.

^36 In (85) we deliberately take $\Lambda_{iA} = \delta_{G(i),A}$ as opposed to $\Lambda_{iA} = \sqrt{C_{ii}}\,\delta_{G(i),A}$ (see below). Note that with (85) the intercept is subsumed in $\Lambda_{iA}$ as we have $\sum_{A=1}^{K} \Lambda_{iA} = 1$, so (82) is automatic.

In the cases i)-iii) above, we compute the residuals $\varepsilon_{is}$ of the weighted regression and the dollar holdings $H_{is}$ via (we use matrix notation and suppress indices):

$$\tilde\varepsilon = Z\,\varepsilon = Z \left[E - Y\,(Y^T Z\, Y)^{-1}\, Y^T Z\, E\right] \tag{87}$$

$$H_{is} = -\tilde\varepsilon_{is}\; \frac{I}{\sum_{j=1}^{N} |\tilde\varepsilon_{js}|} \tag{88}$$

where $Z = \mathrm{diag}(z_i)$, we have dollar neutrality (82),^37 and $\sum_{i=1}^{N} |H_{is}| = I$ (the total intraday dollar investment level (long plus short), which is the same for all dates $s$).
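The weighted regression holdings (87) and (88) can be sketched in R as follows for the sub-industry case ii). The inputs are randomly generated placeholders and the names g, e.ret, z and inv.level are hypothetical; in particular, the sample variances behind the weights are simply drawn at random here rather than estimated from returns.

```r
set.seed(6)
N <- 200           # number of stocks (illustrative)
K <- 10            # number of clusters (illustrative)
inv.level <- 1e6   # investment level I

# Hypothetical ticker -> cluster map and binary loadings as in (85)
g <- sample(rep(1:K, length.out = N))
y <- outer(g, 1:K, "==") * 1   # intercept is subsumed in y

e.ret <- rnorm(N, 0, 0.01)     # hypothetical overnight returns E
z <- 1 / runif(N, 1e-4, 4e-4)  # regression weights 1 / C_ii

# Residuals of the weighted regression of e.ret over y
zy <- z * y                    # Z Y (row i of y times z[i])
eps <- e.ret - y %*% solve(t(y) %*% zy, t(zy) %*% e.ret)

# Weighted residuals (87) and dollar holdings (88)
eps.t <- z * eps
h <- -inv.level * eps.t / sum(abs(eps.t))

# Neutrality w.r.t. each column of y, dollar neutrality, normalization
print(max(abs(t(y) %*% h)))
print(c(sum(h), sum(abs(h))))
```

Since $Y^T Z \varepsilon = 0$ by construction, the holdings are neutral with respect to every column of the loadings matrix, and dollar neutrality follows because the rows of y sum to 1.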

The simulation results are given in Table 2 and P&Ls for the 3 cases i)-iii) are plotted in Figure 2. For comparison purposes - and to allay any potential concerns that the results in Table 2 may not hold for realistic position bounds - in Table 3 we give the simulation results for the same cases i)-iii) above with the strict bounds

$$|H_{is}| \leq 0.01\, A_{is} \tag{89}$$

so not more than 1% of each stock’s ADDV is bought or sold. We use the bounded regression algorithm and the R source code of (Kakushadze, 2015b) to run these simulations. Expectedly, the liquidity bounds (89) lower ROC and CPS while improving SR, but in the same fashion for all 3 weighted regression alphas i)-iii). The results in Tables 2 and 3 confirm our prior intuitive argument that the sub-industries outperform the principal components simply because they are more numerous.^38

If we compute $\Lambda_{iA}$ in (84) and (86) every 21 trading days (instead of daily - see fn. 34), the difference is very slight. E.g., for the heterotic risk factors computed every 21 days (with no bounds) we get: ROC = 51.66%, SR = 13.42, CPS = 2.26.

Finally, let us also mention that, in the weighted regressions ii) and iii), the dollar holdings for the tickers in the single-ticker sub-industries are automatically null. This is not the case for optimized alphas (see below). Generally, if single-ticker (or small) sub-industries are undesirable, one can “prune” the industry hierarchy tree by merging (single-ticker and/or small) sub-industries at the industry level.

^37 Due to $\tilde\varepsilon_{is}$ having 0 cross-sectional means, which in turn is due to the intercept either being included (the cases i) and iii)), or being subsumed in the loadings matrix $Y$ (the case ii)).

^38 In Table 2 the heterotic risk factors outperform the sub-industries. However, this is largely an artifact of defining $\Lambda_{iA}$ as in (85). If we take $\Lambda_{iA} = \sqrt{C_{ii}}\,\delta_{G(i),A}$ instead (and augment the regression loadings matrix $Y$ with the intercept for dollar neutrality), we will get (without the bounds (89) - the results with the bounds are similar): ROC = 51.62%, SR = 13.45, CPS = 2.26.
6.5 Optimized Alphas

As mentioned above, our goal is to determine whether the heterotic risk model construction adds value by comparing the simulated performance of the weighted regression alphas above to the simulated performance of the optimized alphas (via maximizing the Sharpe ratio) based on the same expected returns $E_{is}$. In maximizing the Sharpe ratio, we use the heterotic risk model covariance matrix $\Gamma_{ij}$ given by (44), which we compute every 21 trading days (same as for the universe). For each date (we omit the index $s$) we maximize the Sharpe ratio subject to the dollar neutrality constraint:

$$S = \frac{\sum_{i=1}^{N} H_i\, E_i}{\sqrt{\sum_{i,j=1}^{N} \Gamma_{ij}\, H_i\, H_j}} \rightarrow \max \tag{90}$$

$$\sum_{i=1}^{N} H_i = 0 \tag{91}$$

The solution is given by

$$H_i = -\gamma \left[\sum_{j=1}^{N} \Gamma^{-1}_{ij}\, E_j - \sum_{j=1}^{N} \Gamma^{-1}_{ij}\; \frac{\sum_{k,l=1}^{N} \Gamma^{-1}_{kl}\, E_l}{\sum_{k,l=1}^{N} \Gamma^{-1}_{kl}}\right] \tag{92}$$

where $\Gamma^{-1}$ is the inverse of $\Gamma$ (see Subsection 5.3), and the overall normalization constant $\gamma > 0$ (this is a mean-reversion alpha) is fixed via the requirement that

$$\sum_{i=1}^{N} |H_i| = I \tag{93}$$

Note that (92) satisfies the dollar neutrality constraint (91).

The simulation results are given in Table 2 in the bottom row. The P&L plot for this optimized alpha is included in Figure 2. For the same reasons as in the case of weighted regression alphas, in the bottom row of Table 3 we give the simulation results for the same optimized alpha with the strict liquidity bounds (89).^39 We use the optimization algorithm for maximizing the Sharpe ratio subject to linear homogeneous constraints and bounds discussed in (Kakushadze, 2015a).^40 Also, in the second rows in Tables 2 and 3 we have included the simulation results for the optimized alpha where in the optimization we use the risk factor model covariance matrix $\Gamma_{ij}$ based on the principal components discussed in Section 3.^41 From our simulation results in Tables 2 and 3 it is evident that the heterotic risk model predicts off-diagonal elements of the covariance matrix (that is, correlations) out-of-sample rather well. Indeed, using it in the optimization sizably improves ROC, SR and CPS compared with the weighted regressions with all three loadings i)-iii) above.

^39 In Tables 2 and 3 at the final stage the heterotic risk factors are the (10) BICS sectors: there are enough (20) observations in the time series. The 1-factor model gives almost the same results.

^40 The source code for this algorithm is not included in (Kakushadze, 2015a), so we include it in Appendix C. It is similar to the source code of (Kakushadze, 2015b) for the bounded regression.

^41 This matrix is given by $\Gamma_{ij} = \sqrt{C_{ii}}\sqrt{C_{jj}}\;\tilde\Gamma_{ij}$, where $\tilde\Gamma_{ij}$ is defined in (27), and $K$ is determined via the algorithm of Section 3.1. For the $d = 21$ trading day lookback in our backtests, the value of $K$ fixed by this algorithm turns out to be $K = 13$.
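The closed-form solution (92) with the normalization (93) (no position bounds) can be sketched in R as follows. The covariance matrix here is a randomly generated positive-definite factor-model matrix standing in for the heterotic risk model matrix, and the names omega, xi2, e.ret and inv.level are hypothetical.

```r
set.seed(7)
N <- 100           # number of stocks (illustrative)
K <- 5             # number of factors (illustrative)
inv.level <- 1e6   # investment level I

# Hypothetical positive-definite covariance matrix and its inverse
omega <- matrix(rnorm(N * K, 0, 0.01), N, K)
xi2 <- runif(N, 1e-4, 4e-4)
gamma <- diag(xi2) + omega %*% t(omega)
gamma.inv <- solve(gamma)

e.ret <- rnorm(N, 0, 0.01)   # hypothetical overnight returns E

# Building blocks of (92)
a <- gamma.inv %*% e.ret      # sum_j Gamma^{-1}_ij E_j
b <- gamma.inv %*% rep(1, N)  # sum_j Gamma^{-1}_ij

# Unnormalized holdings; the minus sign reflects mean-reversion
h <- -(a - b * sum(a) / sum(b))

# Fix the overall normalization via sum |H| = I, as in (93)
h <- inv.level * h / sum(abs(h))

print(c(sum(h), sum(abs(h))))
```

Summing the bracket in (92) over $i$ gives sum(a) - sum(b) * sum(a) / sum(b) = 0, which is the dollar neutrality (91) checked by the first printed number.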

7 Concluding Remarks 

The heterotic risk model construction we discuss in this paper is based on a “heterosis” of: i) granularity of an industry classification; ii) diagonality of the principal component factor covariance matrix for any sub-cluster of stocks; and iii) dramatic reduction of the size of the factor covariance matrix in the Russian-doll construction. This is a powerful approach, as is evident from the horse race we ran above.

Naturally, one may wonder if we can extend our construction to risk models which do not include any statistical risk factors (i.e., principal components) or include other non-binary factors such as style factors. A key simplifying feature in the heterotic construction is that the industry classification, which is used as the backbone (and is augmented with the principal components to satisfy the conditions (8)), is binary. Once non-binary risk factors are included, it is more nontrivial to compute the specific risk and the factor covariance matrix (such that (8) are satisfied). However, there exist proprietary algorithms for dealing with this, which are outside of the scope of this paper. We hope to make these algorithms public knowledge elsewhere.

One final remark concerns purely statistical risk models based on principal components. Albeit their market share is rather limited, it is unclear why a portfolio manager would be willing to pay for such models considering that they are straightforward to build in-house, especially now that we have provided the source code for constructing them. One argument is that using option implied volatility (which is available only for optionable stocks) to model stock volatility should work better, and if a portfolio manager does not possess the implied volatility data or the know-how for incorporating it into a statistical risk model, he or she would be better off simply buying one from a provider. However, this argument appears to be thin, at best. Nowadays, with ever-shortening lookbacks, it is unclear if the implied volatility indeed adds any value when the risk model is used in actual portfolio optimization for actual alphas. In this regard a new study would appear to be warranted. In any event, as we saw above, heterotic risk models outperform principal component risk models by a significant margin, so one can build heterotic risk models in-house (instead of buying less powerful statistical models) now that this know-how is in the public domain. The only data needed to construct a heterotic risk model is: i) adjusted close prices; and ii) a granular enough binary industry classification, such as GICS, BICS, ICB, etc. Most quantitative traders already have this data in-house. So, we hope this paper further encourages/aids organic custom risk model building.

^42 There exist further (proprietary) performance improvements using the heterotic risk model.

^43 In this context, the paper (Ederington and Guan, 2002) is sometimes referred to.
A R Code: Principal Component Risk Model

In this appendix we give the R (R Package for Statistical Computing, http://www.r-project.org) source code for building a purely statistical risk model (principal components) based on the algorithm we discuss in Section 3, including the algorithm for fixing the number of factors $K$ in Section 3.1. The code below is essentially self-explanatory and straightforward. It consists of a single function qrm.cov.pc(ret, use.cor = T). The input is: i) ret, an $N \times d$ matrix of returns (e.g., daily close-to-close returns), where $N$ is the number of tickers, $d$ is the number of observations in the time series (e.g., the number of trading days), and the ordering of the dates is immaterial; and ii) use.cor, where for TRUE (default) the risk factors are computed based on the principal components of the sample correlation matrix $\Psi_{ij}$, whereas for FALSE they are computed based on the sample covariance matrix $C_{ij}$. The output is a list: result$spec.risk is the specific risk $\tilde\xi_i$ (not the specific variance $\tilde\xi_i^2$), result$fac.load is the factor loadings matrix $\tilde\Omega_{iA} = \sqrt{C_{ii}}\;\Omega_{iA}$, result$fac.cov is the factor covariance matrix $\Phi_{AB}$ (with the normalization (26) for the factor loadings matrix, $\Phi_{AB} = \delta_{AB}$), result$cov.mat is the factor model covariance matrix $\Gamma_{ij} = \sqrt{C_{ii}}\sqrt{C_{jj}}\;\tilde\Gamma_{ij}$, and result$inv.cov is the matrix $\Gamma^{-1}_{ij}$ inverse to $\Gamma_{ij}$.

qrm.cov.pc <- function (ret, use.cor = T)
{
   print("Running qrm.cov.pc()...")

   # sample volatilities; for use.cor = T work with normalized returns
   tr <- apply(ret, 1, sd)
   if(use.cor)
      ret <- ret / tr

   d <- ncol(ret)
   x <- t(ret)
   x <- var(x, x)   # sample correlation (or covariance) matrix
   tv <- diag(x)    # total variances
   x <- eigen(x)

   # add principal components one at a time until g stops decreasing
   g.prev <- 999
   for(k in 1:(d-1))
   {
      u <- x$values[1:k]
      v <- x$vectors[, 1:k]
      v <- t(sqrt(u) * t(v))
      x.f <- v %*% t(v)       # factor contribution
      x.s <- tv - diag(x.f)   # specific variances
      z <- x.s / tv
      g <- abs(sqrt(min(z)) + sqrt(max(z)) - 1)
      if(g > g.prev)
         break
      g.prev <- g
      spec.risk <- sqrt(x.s)
      fac.load <- v
      fac.cov <- diag(1, k)
      cov.mat <- diag(x.s) + x.f
   }
   k <- k - 1

   # inverse of the model matrix via its factor model structure
   y.s <- 1 / spec.risk / spec.risk
   v <- fac.load
   v1 <- y.s * v
   inv.cov <- diag(y.s) - v1 %*% solve(diag(1, k) + t(v) %*% v1) %*% t(v1)

   # rescale from correlation-based to covariance-based quantities
   if(use.cor)
   {
      spec.risk <- tr * spec.risk
      fac.load <- tr * fac.load
      cov.mat <- tr * t(tr * cov.mat)
      inv.cov <- t(inv.cov / tr) / tr
   }

   result <- new.env()
   result$spec.risk <- spec.risk
   result$fac.load <- fac.load
   result$fac.cov <- fac.cov
   result$cov.mat <- cov.mat
   result$inv.cov <- inv.cov
   result <- as.list(result)
   return(result)
}

B R Code: Heterotic Risk Model 

In this appendix we give the R source code for building the heterotic risk model based on the algorithm we discuss in Section 5.2. The code below is essentially self-explanatory and straightforward as it simply follows the formulas in Section 5.2. It consists of a single function qrm.het(ret, ind, mkt.fac = F, rm.sing.tkr = F). The input is: i) ret, an $N \times d$ matrix of returns (e.g., daily close-to-close returns), where $N$ is the number of tickers, $d$ is the number of observations in the time series (e.g., the number of trading days), and the ordering of the dates is immaterial; ii) ind is a list whose length a priori is arbitrary, and its elements are populated by the binary matrices (with rows corresponding to tickers, so dim(ind[[·]])[1] is $N$) corresponding to the levels in the input binary industry classification hierarchy in the order of decreasing granularity (so, in the BICS case ind[[1]] is the $N \times K$ matrix $\delta_{G(i),A}$ (sub-industries), ind[[2]] is the $N \times F$ matrix $\delta_{G'(i),a}$ (industries), and ind[[3]] is the $N \times L$ matrix $\delta_{G''(i),\alpha}$ (sectors), where the map $G$ is defined in (31) (tickers to sub-industries), $G' = GS$ (tickers to industries), and $G'' = GSW$ (tickers to sectors), with the map $S$ (sub-industries to industries) defined in (32), and the map $W$ (industries to sectors) defined in (33)); iii) mkt.fac, where for TRUE at the final step we have a single factor (“market”), while for FALSE (default) the factors correspond to the least granular level in the industry classification hierarchy; and iv) rm.sing.tkr, where for TRUE the tickers corresponding to the single-ticker clusters at the most granular level in the industry classification hierarchy (in the BICS case this would be the sub-industry level) are dropped altogether, while for FALSE (default) the output universe is the same as the input universe. The output is a list: result$spec.risk is the specific risk $\tilde\xi_i$ (not the specific variance $\tilde\xi_i^2$), result$fac.load is the factor loadings matrix $\tilde\Omega_{iA} = \sqrt{C_{ii}}\;\Omega_{iA}$, result$fac.cov is the factor covariance matrix $\Phi^*_{AB}$, result$cov.mat is the factor model covariance matrix $\Gamma_{ij} = \sqrt{C_{ii}}\sqrt{C_{jj}}\;\tilde\Gamma_{ij}$, and result$inv.cov is the matrix $\Gamma^{-1}_{ij}$ inverse to $\Gamma_{ij}$.

qrm.het <- function (ret, ind, mkt.fac = F, rm.sing.tkr = F)
{
   print("Running qrm.het()...")

   # optionally drop the tickers in single-ticker clusters
   if(rm.sing.tkr)
   {
      bad <- colSums(ind[[1]]) == 1
      ind[[1]] <- ind[[1]][, !bad]
      bad <- rowSums(ind[[1]]) == 0
      for(lvl in 1:length(ind))
         ind[[lvl]] <- ind[[lvl]][!bad, ]
      ret <- ret[!bad, ]
   }

   cov.mat <- list()
   u <- list()
   flm <- ind

   # binary map between two adjacent levels of the hierarchy
   calc.load <- function(load, load1)
   {
      x <- colSums(load1)
      load <- (t(load1) %*% load) / x
      return(load)
   }

   calc.cor.mat <- function(cov.mat)
   {
      tr <- sqrt(diag(cov.mat))
      cor.mat <- t(cov.mat / tr) / tr
      return(cor.mat)
   }

   calc.cov.mat <- function(cor.mat, tr)
   {
      cov.mat <- t(cor.mat * tr) * tr
      return(cov.mat)
   }

   cov.mat[[1]] <- var(t(ret), t(ret))
   cor.mat <- calc.cor.mat(cov.mat[[1]])

   # going up the hierarchy: first principal component of each cluster
   for(lvl in 1:length(ind))
   {
      if(lvl > 1)
         flm[[lvl]] <- calc.load(ind[[lvl]], ind[[lvl-1]])
      u[[lvl]] <- rep(NA, nrow(flm[[lvl]]))
      for(a in 1:ncol(flm[[lvl]]))
      {
         take <- as.logical(flm[[lvl]][, a])
         x <- cor.mat[take, take]
         y <- eigen(x)$vectors
         y <- y[, 1]
         u[[lvl]][take] <- y
      }
      flm[[lvl]] <- u[[lvl]] * flm[[lvl]]
      cm <- cov.mat[[lvl + 1]] <- t(flm[[lvl]]) %*% cor.mat %*% flm[[lvl]]
      cor.mat <- calc.cor.mat(cm)
   }

   # final step: sample matrix or a single ("market") factor
   if(!mkt.fac)
      mod.mat <- cm
   else
   {
      k <- nrow(cor.mat)
      x <- eigen(cor.mat)
      y <- x$vectors
      y <- y[, 1]
      z <- x$values[1]
      mod.mat <- matrix(z, k, k)
      mod.mat <- t(mod.mat * y) * y
      diag(mod.mat) <- 1
      mod.mat <- calc.cov.mat(mod.mat, sqrt(diag(cm)))
   }

   # going back down the hierarchy: build the model matrix at each level
   for(lvl in length(ind):1)
   {
      fac.cov <- mod.mat
      mod.mat <- flm[[lvl]] %*% mod.mat %*% t(flm[[lvl]])
      sv <- diag(1 - mod.mat)
      diag(mod.mat) <- 1
      tr <- sqrt(diag(cov.mat[[lvl]]))
      mod.mat <- calc.cov.mat(mod.mat, tr)
   }

   # single-ticker clusters: move their risk into the specific risk
   if(!rm.sing.tkr)
   {
      take <- colSums(ind[[1]]) == 1
      x <- diag(fac.cov)
      x[take] <- 0
      diag(fac.cov) <- x
      if(sum(take) > 1)
         take <- rowSums(ind[[1]][, take]) == 1
      else if(sum(take) == 1)
         take <- as.vector(ind[[1]][, take]) == 1
      else
         take <- rep(F, nrow(ind[[1]]))
      sv[take] <- 1
   }

   spec.risk <- tr * sqrt(sv)
   fac.load <- tr * flm[[1]]

   # inverse of the model matrix via its factor model structure
   v <- flm[[1]] / sv
   d <- solve(fac.cov) + t(flm[[1]]) %*% v
   inv <- diag(1 / sv) - v %*% solve(d) %*% t(v)
   inv <- calc.cov.mat(inv, 1 / tr)

   result <- new.env()
   result$spec.risk <- spec.risk
   result$fac.load <- fac.load
   result$fac.cov <- fac.cov
   result$cov.mat <- mod.mat
   result$inv.cov <- inv
   result <- as.list(result)
   return(result)
}

C R Code: Optimizer with Constraints & Bounds 

In this appendix we give the R source code for the optimization algorithm with linear homogeneous constraints and position bounds we use in Section 6.5. This code is similar to the code for the bounded regression algorithm discussed in detail in (Kakushadze, 2015b) with one important difference, so our discussion here will be brief. The entry function is bopt.calc.opt(). The args() of bopt.calc.opt() are: ret, which is the $N$-vector of stock returns (for a given date); load, a matrix whose columns are the coefficients of the homogeneous constraints, so dim(load)[1] is $N$ (e.g., if the sole constraint is the dollar neutrality constraint, then load is an $N \times 1$ matrix with unit elements); inv.cov, which is the $N \times N$ inverse factor model covariance matrix $\Gamma^{-1}_{ij}$; upper, which is the $N$-vector of the upper bounds $w_i^+$ on the weights $w_i$ (see below); lower, which is the $N$-vector of the lower bounds $w_i^-$ on the weights $w_i$; and prec, which is the desired precision with which the output weights $w_i$, the $N$-vector of which bopt.calc.opt() returns, must satisfy the normalization condition $\sum_{i=1}^{N} |w_i| = 1$. Here the weights are defined as $w_i = H_i / I$ (the dollar holdings over the total investment level). See (Kakushadze, 2015b) for more detail.

^44 The analog of the line y <- t(load[Jt, ]) %*% w.ret[Jt, ] below reads y <- t(w.load[Jt, ]) %*% ret[Jt, ] in (Kakushadze, 2015b), where the analog of inv.cov is diagonal and both lines give the same result; however, for non-diagonal inv.cov here they do not.

bopt.calc.opt <- function (ret, load, inv.cov, upper, lower, prec = 1e-5)
{
   # start from the unbounded solution and rescale the returns
   x <- bopt.gen.lm(ret, load, inv.cov)
   ret <- ret / sum(abs(x))

   # iterate until the weights satisfy sum(abs(x)) = 1 within prec
   repeat{
      x <- bopt.opt(ret, load, inv.cov, upper, lower)
      if(abs(sum(abs(x)) - 1) < prec)
         break
      ret <- ret / sum(abs(x))
   }

   return(x)
}

bopt.gen.lm <- function (x, y, z)
{
   if(is.vector(z))
      z <- diag(z)
   if(is.vector(y))
      y <- matrix(y, length(y), 1)
   if(is.vector(x))
      x <- matrix(x, length(x), 1)

   y1 <- z %*% y
   res <- (z - y1 %*% solve(t(y) %*% y1) %*% t(y1)) %*% x
   return(res)
}

bopt.opt <- function (ret, load, inv.cov, upper, lower, tol = 1e-6)
{
   # move from z toward x until the first bound is hit
   calc.bounds <- function(z, x)
   {
      q <- x - z
      p <- rep(NA, length(x))
      pp <- pmin(x, upper)
      pm <- pmax(x, lower)
      p[q > 0] <- pp[q > 0]
      p[q < 0] <- pm[q < 0]
      t <- (p - z)/q
      t <- min(t, na.rm = T)
      z <- z + t * q
      return(z)
   }

   if(!is.matrix(load))
      load <- matrix(load, length(load), 1)

   n <- nrow(load)
   k <- ncol(load)
   ret <- matrix(ret, n, 1)
   upper <- matrix(upper, n, 1)
   lower <- matrix(lower, n, 1)
   z <- inv.cov
   w.load <- z %*% load
   w.ret <- z %*% ret

   J <- rep(T, n)
   Jp <- rep(F, n)   # positions at the upper bound
   Jm <- rep(F, n)   # positions at the lower bound
   z <- rep(0, n)

   repeat{
      Jt <- J & !Jp & !Jm
      y <- t(load[Jt, ]) %*% w.ret[Jt, ]
      if(sum(Jp) > 1)
         y <- y + t(load[Jp, ]) %*% upper[Jp, ]
      else if(sum(Jp) == 1)
         y <- y + upper[Jp, ] * matrix(load[Jp, ], k, 1)
      if(sum(Jm) > 1)
         y <- y + t(load[Jm, ]) %*% lower[Jm, ]
      else if(sum(Jm) == 1)
         y <- y + lower[Jm, ] * matrix(load[Jm, ], k, 1)
      if(k > 1)
         take <- colSums(abs(load[Jt, ])) > 0
      else
         take <- T
      Q <- t(load[Jt, take]) %*% w.load[Jt, take]
      Q <- solve(Q)
      v <- Q %*% y[take]
      xJp <- Jp
      xJm <- Jm
      x <- w.ret - w.load[, take] %*% v
      x[Jp, ] <- upper[Jp, ]
      x[Jm, ] <- lower[Jm, ]
      z <- calc.bounds(z, x)
      Jp <- abs(z - upper) < tol
      Jm <- abs(z - lower) < tol
      if(all(Jp == xJp) & all(Jm == xJm))
         break
   }

   return(z)
}

D DISCLAIMERS 

Wherever the context so requires, the masculine gender includes the feminine and/or 
neuter, and the singular form includes the plural and vice versa. The author of this 
paper (“Author”) and his affiliates including without limitation Quantigic® Solutions
LLC (“Author’s Affiliates” or “his Affiliates”) make no implied or express
warranties or any other representations whatsoever, including without limitation 
implied warranties of merchantability and fitness for a particular purpose, in connection
with or with regard to the content of this paper including without limitation
any code or algorithms contained herein (“Content”). 

The reader may use the Content solely at his/her/its own risk and the reader 
shall have no claims whatsoever against the Author or his Affiliates and the Author 
and his Affiliates shall have no liability whatsoever to the reader or any third party 
whatsoever for any loss, expense, opportunity cost, damages or any other adverse 
effects whatsoever relating to or arising from the use of the Content by the reader 
including without any limitation whatsoever: any direct, indirect, incidental, special,
consequential or any other damages incurred by the reader, however caused
and under any theory of liability; any loss of profit (whether incurred directly or 
indirectly), any loss of goodwill or reputation, any loss of data suffered, cost of procurement
of substitute goods or services, or any other tangible or intangible loss;
any reliance placed by the reader on the completeness, accuracy or existence of the 
Content or any other effect of using the Content; and any and all other adversities 
or negative effects the reader might encounter in using the Content irrespective of 
whether the Author or his Affiliates is or are or should have been aware of such 
adversities or negative effects. 

The R code included in Appendix A, Appendix B and Appendix C hereof is part 
of the copyrighted R code of Quantigic® Solutions LLC and is provided herein with 
the express permission of Quantigic® Solutions LLC. The copyright owner retains all 
rights, title and interest in and to its copyrighted source code included in Appendix 
A, Appendix B and Appendix C hereof and any and all copyrights therefor. 



References 


Acharya, V.V. and Pedersen, L.H. (2005) Asset pricing with liquidity risk. 
Journal of Financial Economics 77(2): 375-410. 

Ang, A., Hodrick, R., Xing, Y. and Zhang, X. (2006) The Cross-Section of 
Volatility and Expected Returns. Journal of Finance 61(1): 259-299. 

Anson, M. (2013/14) Performance Measurement in Private Equity: Another 
Look at the Lagged Beta Effect. The Journal of Private Equity 17(1): 29-44. 

Asness, C.S. (1995) The Power of Past Stock Returns to Explain Future Stock 
Returns. Goldman Sachs Asset Management. Working paper. 

Asness, C. and Stevens, R. (1995) Intra- and Inter-Industry Variation in the 
Cross-Section of Expected Stock Returns. Goldman Sachs Asset Management. 
Working paper. 

Asness, C., Krail, R.J. and Liew, J.M. (2001) Do Hedge Funds Hedge? The
Journal of Portfolio Management 28(1): 6-19.

Bai, J. (2003) Inferential theory for factor models of large dimension.
Econometrica 71(1): 135-171.

Bai, J. and Li, K. (2012) Statistical Analysis of Factor Models of High Dimension.
The Annals of Statistics 40(1): 436-465.

Bai, J. and Ng, S. (2002) Determining the number of factors in approximate 
factor models. Econometrica 70(1): 191-221. 

Bansal, R. and Viswanathan, S. (1993) No Arbitrage and Arbitrage Pricing: A 
New Approach. The Journal of Finance 48(4): 1231-1262.

Banz, R. (1981) The relationship between return and market value of common 
stocks. Journal of Financial Economics 9(1): 3-18. 

Basu, S. (1977) The investment performance of common stocks in relation to 
their price to earnings ratios: A test of the efficient market hypothesis. Journal 
of Finance 32(3): 663-682. 

Black, F. (1972) Capital market equilibrium with restricted borrowing. Journal 
of Business 45(3): 444-455.

Black, F., Jensen, M. and Scholes, M. (1972) The capital asset pricing model: 
Some empirical tests. In: Jensen, M. (ed.) Studies in the Theory of Capital 
Markets. New York, NY: Praeger Publishers, pp. 79-121. 


Blume, M. and Friend, I. (1973) A new look at the capital asset pricing model.
Journal of Finance 28(1): 19-33.

Brandt, M.W., Brav, A., Graham, J.R. and Kumar, A. (2010) The idiosyncratic 
volatility puzzle: Time trend or speculative episodes? Review of Financial 
Studies 23(2): 863-899.

Briner, B. and Connor, G. (2008) How much structure is best? A comparison 
of market model, factor model and unstructured equity covariance matrices. 
Journal of Risk 10(4): 3-30.

Burmeister, E. and Wall, K.D. (1986) The arbitrage pricing theory and
macroeconomic factor measures. Financial Review 21(1): 1-20.

Campbell, J. (1987) Stock returns and the term structure. Journal of Financial 
Economics 18(2): 373-399. 

Campbell, J.Y., Lettau, M., Malkiel, B.G. and Xu, Y. (2001) Have individual 
stocks become more volatile? An empirical exploration of idiosyncratic risk. 
Journal of Finance 56(1): 1-43.

Campbell, J. and Shiller, R. (1988) The dividend-price ratio and expectations 
of future dividends and discount factors. Review of Financial Studies 1(3): 195- 
227. 

Carhart, M.M. (1997) Persistence in mutual fund performance. Journal of
Finance 52(1): 57-82.

Chamberlain, G. and Rothschild, M. (1983) Arbitrage, Factor Structure, and 
Mean-Variance Analysis on Large Asset Markets. Econometrica 51(5): 1281- 
1304. 

Chan, K.C., Chen, N. and Hsieh, D. (1985) An Exploratory Investigation of 
the Firm Size Effect. Journal of Financial Economics 14(3): 451-471. 

Chen, N., Grundy, B. and Stambaugh, R.F. (1990) Changing Risk, Changing 
Risk Premiums, and Dividend Yield Effects. The Journal of Business 63(1): 
51-70. 

Chen, N., Roll, R. and Ross, S. (1986) Economic forces and the stock market. 
Journal of Business 59(3): 383-403.

Chicheportiche, R. and Bouchaud, J.-P. (2014) A nested factor model for 
non-linear dependencies in stock returns. Quantitative Finance (forthcoming), 
DOI: 10.1080/14697688.2014.994668.

Cochrane, J.H. (2001) Asset Pricing. Princeton, NJ: Princeton University Press. 


Connor, G. (1984) A unified beta pricing theory. Journal of Economic Theory 
34(1): 13-31. 

Connor, G. (1995) The Three Types of Factor Models: A Comparison of Their 
Explanatory Power. Financial Analysts Journal 51(3): 42-46. 

Connor, G. and Korajczyk, R. (1988) Risk and return in an equilibrium APT: 
Application of a new test methodology. Journal of Financial Economics 21(2): 
255-289. 

Connor, G. and Korajczyk, R. (1989) An intertemporal beta pricing model. 
Review of Financial Studies 2(3): 373-392. 

Connor, G. and Korajczyk, R. (2010) Factor Models in Portfolio and Asset 
Pricing Theory. In: Guerard Jr, J.B. (ed.) Handbook of Portfolio Construction: 
Contemporary Applications of Markowitz Techniques. New York, NY: Springer, 
pp. 401-418. 

Daniel, K. and Titman, S. (1997) Evidence on the Characteristics of Cross 
Sectional Variation in Stock Returns. Journal of Finance 52(1): 1-33. 

DeBondt, W. and Thaler, R. (1985) Does the stock market overreact? Journal 
of Finance 40(3): 739-805. 

Dhrymes, P.J., Friend, I. and Gultekin, N.B. (1984) A Critical Reexamination 
of the Empirical Evidence on the Arbitrage Pricing Theory. The Journal of 
Finance 39(2): 323-346. 

Ederington, L. and Guan, W. (2002) Is implied volatility an informationally 
efficient and effective predictor of future volatility? The Journal of Risk 4(3): 
29-46. 

Fama, E. and French, K. (1992) The cross-section of expected stock returns. 
Journal of Finance 47(2): 427-465.

Fama, E.F. and French, K.R. (1993) Common risk factors in the returns on 
stocks and bonds. Journal of Financial Economics 33(1): 3-56.

Fama, E. and French, K. (1996) Multifactor explanations for asset pricing 
anomalies. Journal of Finance 51(1): 55-94.

Fama, E. and French, K. (2015) A Five-Factor Asset Pricing Model. Journal of 
Financial Economics (forthcoming), DOI: 10.1016/j.jfineco.2014.10.010.

Fama, E.F. and MacBeth, J.D. (1973) Risk, Return and Equilibrium: Empirical 
Tests. Journal of Political Economy 81(3): 607-636.


Ferson, W. and Harvey, C. (1991) The variation in economic risk premiums.
Journal of Political Economy 99(2): 385-415.

Ferson, W. and Harvey, C. (1999) Conditioning variables and the cross section
of stock returns. Journal of Finance 54(4): 1325-1360.

Forni, M., Hallin, M., Lippi, M. and Reichlin, L. (2000) The generalized
dynamic factor model: Identification and estimation. The Review of Economics
and Statistics 82(4): 540-554.

Forni, M., Hallin, M., Lippi, M. and Reichlin, L. (2005) The generalized
dynamic factor model: One-sided estimation and forecasting. Journal of the
American Statistical Association 100(471): 830-840.

Forni, M. and Lippi, M. (2001) The generalized dynamic factor model:
Representation theory. Econometric Theory 17(6): 1113-1141.

Goyal, A., Perignon, C. and Villa, C. (2008) How common are common return 
factors across the NYSE and Nasdaq? Journal of Financial Economics 90(3): 
252-271. 

Goyal, A. and Santa-Clara, P. (2003) Idiosyncratic risk matters! Journal of 
Finance 58(3): 975-1007.

Grinold, R.C. and Kahn, R.N. (2000) Active Portfolio Management. New York, 
NY: McGraw-Hill. 

Hall, A.D., Hwang, S. and Satchell, S.E. (2002) Using Bayesian variable
selection methods to choose style factors in global stock return models. Journal of
Banking and Finance 26(12): 2301-2325.

Haugen, R.A. (1995) The New Finance: The Case Against Efficient Markets. 
Upper Saddle River, NJ: Prentice Hall. 

Heaton, J. and Lucas, D.J. (1999) Stock Prices and Fundamentals. NBER
Macroeconomics Annual 14(1): 213-242. 

Heston, S.L. and Rouwenhorst, K.G. (1994) Does Industrial Structure Explain
the Benefits of International Diversification? Journal of Financial Economics
36(1): 3-27. 

Jagannathan, R. and Wang, Z. (1996) The conditional CAPM and the cross-
section of expected returns. Journal of Finance 51(1): 3-53. 

Jegadeesh, N. and Titman, S. (1993) Returns to buying winners and selling 
losers: Implications for stock market efficiency. Journal of Finance 48(1): 65- 
91. 


Jegadeesh, N. and Titman, S. (2001) Profitability of Momentum Strategies: An 
Evaluation of Alternative Explanations. Journal of Finance 56(2): 699-720. 

Kakushadze, Z. (2014) 4-Factor Model for Overnight Returns. Wilmott
Magazine (forthcoming); http://ssrn.com/abstract=2511874 (October 19, 2014).

Kakushadze, Z. (2015a) Mean-Reversion and Optimization. Journal of Asset 
Management 16(1): 14-40.

Kakushadze, Z. (2015b) Combining Alphas via Bounded Regression. SSRN 
Working Papers Series, http://ssrn.com/abstract=2550335 (January 15, 2015). 

Kakushadze, Z. (2015c) Russian-Doll Risk Models. Journal of Asset
Management 16(3): 170-185.

Kakushadze, Z. and Liew, J.K.-S. (2015) Custom v. Standardized Risk Models. 
Risks 3(2): 112-138.

King, B.F. (1966) Market and Industry Factors in Stock Price Behavior. Journal 
of Business 39(1): 139-190.

Korajczyk, R.A. and Sadka, R. (2008) Pricing the Commonality Across
Alternative Measures of Liquidity. Journal of Financial Economics 87(1): 45-72.

Kothari, S. and Shanken, J. (1997) Book-to-market, dividend yield and
expected market returns: A time series analysis. Journal of Financial Economics
44(2): 169-203. 

Lakonishok, J., Shleifer, A. and Vishny, R.W. (1994) Contrarian Investment, 
Extrapolation, and Risk. The Journal of Finance 49(5): 1541-1578. 

Lee, J.-H. and Stefek, D. (2008) Do Risk Factors Eat Alphas? The Journal of 
Portfolio Management 34(4): 12-24.

Lehmann, B. and Modest, D. (1988) The empirical foundations of the arbitrage 
pricing theory. Journal of Financial Economics 21(2): 213-254. 

Liew, J. and Vassalou, M. (2000) Can Book-to-Market, Size and Momentum be 
Risk Factors that Predict Economic Growth? Journal of Financial Economics 
57(2): 221-245. 

Lintner, J. (1965) The valuation of risky assets and the selection of risky
investments in stock portfolios and capital budgets. The Review of Economics
and Statistics 47(1): 13-37.

Lo, A.W. (2010) Hedge Funds: An Analytic Perspective. Princeton, NJ: Princeton
University Press.


Lo, A.W. and MacKinlay, A.C. (1990) Data-snooping biases in tests of financial 
asset pricing models. Review of Financial Studies 3(3): 431-468. 

MacKinlay, A.C. (1995) Multifactor models do not explain deviations from the 
CAPM. Journal of Financial Economics 38(1): 3-28.

MacQueen, J. (2003) The structure of multifactor equity risk models. Journal 
of Asset Management 3(4): 313-322.

Markowitz, H.M. (1952) Portfolio Selection. Journal of Finance 7(1): 77-91.

Markowitz, H.M. (1984) The Two-Beta Trap. Journal of Portfolio Management 
11(1): 12-19.

Menchero, J. and Mitra, I. (2008) The Structure of Hybrid Factor Models. 
Journal of Investment Management 6(3): 35-47.

Merton, R. (1973) An intertemporal capital asset pricing model. Econometrica 
41(5): 867-887. 

Miller, G. (2006) Needles, Haystacks, and Hidden Factors. The Journal of
Portfolio Management 32(2): 25-32.

Motta, G., Hafner, C. and von Sachs, R. (2011) Locally stationary factor
models: identification and nonparametric estimation. Econometric Theory 27(6):
1279-1319.

Mukherjee, D. and Mishra, A.K. (2005) Multifactor Capital Asset Pricing 
Model Under Alternative Distributional Specification. SSRN Working Papers
Series, http://ssrn.com/abstract=871398 (December 29, 2005). 

Ng, V., Engle, R.F. and Rothschild, M. (1992) A multi-dynamic-factor model 
for stock returns. Journal of Econometrics 52(1-2): 245-266. 

Pastor, L. and Stambaugh, R.F. (2003) Liquidity Risk and Expected Stock 
Returns. The Journal of Political Economy 111(3): 642-685. 

Roll, R. and Ross, S.A. (1980) An Empirical Investigation of the Arbitrage 
Pricing Theory. Journal of Finance 35(5): 1073-1103.

Rosenberg, B. (1974) Extra-Market Components of Covariance in Security
Returns. Journal of Financial and Quantitative Analysis 9(2): 263-274.

Ross, S.A. (1976) The arbitrage theory of capital asset pricing. Journal of
Economic Theory 13(3): 341-360.

Ross, S.A. (1978a) A Simple Approach to the Valuation of Risky Streams. 
Journal of Business 51(3): 453-475.


Ross, S.A. (1978b) Mutual Fund Separation in Financial Theory - The
Separating Distributions. Journal of Economic Theory 17(2): 254-286.

Scholes, M. and Williams, J. (1977) Estimating Betas from Nonsynchronous 
Data. Journal of Financial Economics 5(3): 309-327.

Schwert, G. (1990) Stock returns and real activity: A century of evidence. 
Journal of Finance 45(4): 1237-1257.

Shanken, J. (1987) Nonsynchronous data and the covariance-factor structure of 
returns. Journal of Finance 42(2): 221-231.

Shanken, J. (1990) Intertemporal Asset Pricing: An Empirical Investigation. 
Journal of Econometrics 45(1-2): 99-120.

Shanken, J. and Weinstein, M.I. (2006) Economic Forces and the Stock Market 
Revisited. Journal of Empirical Finance 13(2): 129-144. 

Sharpe, W.F. (1963) A simplified model for portfolio analysis. Management
Science 9(2): 277-293.

Sharpe, W. (1964) Capital asset prices: A theory of market equilibrium under 
conditions of risk. Journal of Finance 19(3): 425-442.

Stock, J.H. and Watson, M.W. (2002a) Macroeconomic forecasting using diffusion
indexes. Journal of Business and Economic Statistics 20(2): 147-162.

Stock, J.H. and Watson, M.W. (2002b) Forecasting using principal components 
from a large number of predictors. Journal of the American Statistical
Association 97(460): 1167-1179.

Stroyny, A.L. (2005) Estimating a Combined Linear Factor Model. In: Knight, 
J. and Satchell, S.E. (eds.) Linear Factor Models in Finance. Oxford: Elsevier
Butterworth-Heinemann. 

Treynor, J.L. (1999) Towards a Theory of Market Value of Risky Assets. In: 
Korajczyk, R. (ed.) Asset Pricing and Portfolio Performance: Models, Strategy,
and Performance Metrics. London: Risk Publications.

Vassalou, M. (2003) News Related to Future GDP Growth as a Risk Factor in 
Equity Returns. Journal of Financial Economics 68(1): 47-73.

Whitelaw, R. (1997) Time variations and covariations in the expectation and 
volatility of stock market returns. Journal of Finance 49(2): 515-541. 

Zangari, P. (2003) Equity factor risk models. In: Litterman, B. (ed.) Modern
Investment Management: An Equilibrium Approach. New York, NY: John Wiley
& Sons, Inc., pp. 334-395. 


Zhang, C. (2010) A Re-examination of the Causes of Time-varying Stock Return 
Volatilities. Journal of Financial and Quantitative Analysis 45(3): 663-684.


Table 1: First column: the number of principal components K; last column: g(K)
defined in (30); Min, 1st Quartile, Median, Mean, 3rd Quartile and Max refer to
the corresponding quantities for the ratio $\xi_i^2/C_{ii}$ (specific variance over total
variance). The number of observations (days) in the time series is M + 1 = 20. The
number of (randomly selected) stocks is N = 2316. All quantities are rounded to 3
digits. The value of K fixed via (29) is K = 12. See Figure 1 for a density plot.


K    Min     1st Quartile   Median   Mean    3rd Quartile   Max     g(K)
1    0.16    0.62           0.824    0.771   0.953          1       0.4
2    0.137   0.525          0.693    0.682   0.867          1       0.37
3    0.084   0.453          0.629    0.618   0.8            0.999   0.29
4    0.075   0.405          0.562    0.56    0.718          0.992   0.27
5    0.06    0.355          0.501    0.51    0.668          0.981   0.235
6    0.06    0.312          0.449    0.462   0.606          0.977   0.233
7    0.057   0.272          0.396    0.417   0.552          0.931   0.203
8    0.033   0.233          0.347    0.375   0.503          0.916   0.139
9    0.029   0.203          0.306    0.334   0.446          0.884   0.111
10   0.019   0.172          0.264    0.294   0.39           0.84    0.056
11   0.01    0.144          0.227    0.256   0.339          0.84    0.018
12   0.009   0.118          0.194    0.22    0.294          0.84    0.013
13   0.008   0.093          0.157    0.186   0.25           0.728   0.056
14   0.002   0.07           0.122    0.152   0.204          0.696   0.12
15   0.002   0.05           0.089    0.119   0.162          0.686   0.128
16   0       0.029          0.062    0.088   0.122          0.608   0.211
17   0       0.014          0.035    0.057   0.077          0.606   0.221
18   0       0.003          0.011    0.028   0.034          0.592   0.231

Table 2: Simulation results for the weighted regression alphas discussed in Section
6.4 and the optimized alphas discussed in Section 6.5, without any bounds on the 
dollar holdings. All quantities are rounded to 2 digits. See Figure 2 for P&L plots. 


Alpha                                  ROC      SR      CPS
Regression: Principal Components       46.80%   11.50   2.05
Optimization: Principal Components     47.74%   11.88   2.26
Regression: BICS Sub-Industries        49.36%   12.89   2.16
Regression: Heterotic Risk Factors     51.89%   13.63   2.27
Optimization: Heterotic Risk Model     55.90%   15.41   2.67


Table 3: Simulation results for the weighted regression alphas discussed in Section 
6.4 and the optimized alphas discussed in Section 6.5, with the liquidity bounds (89) 
on the dollar holdings. All quantities are rounded to 2 digits. See Figure 3 for P&L 
plots. 


Alpha                                  ROC      SR      CPS
Regression: Principal Components       41.27%   14.24   1.84
Optimization: Principal Components     40.92%   14.33   1.96
Regression: BICS Sub-Industries        44.56%   16.51   1.97
Regression: Heterotic Risk Factors     46.86%   18.30   2.08
Optimization: Heterotic Risk Model     49.00%   19.23   2.36

[Plot label: Log of Specific Variance over Total Variance]

Figure 1. The density (computed using the R function density()) for the log of the ratio
$\xi_i^2/C_{ii}$ (specific variance over total variance) for the K = 12 case in Table 1.

Figure 2. P&L graphs for the intraday alphas without liquidity bounds summarized in
Table 2. Bottom-to-top-performing: i) regression over 20 principal components (Section 
6.4), ii) optimization using the principal component risk model (Section 6.5), iii) regression 
over the BICS sub-industries (Section 6.4), iv) regression over the heterotic risk factor 
loadings (Section 6.4), and v) optimization using the heterotic risk model (Section 6.5). 
The investment level is $10M long plus $10M short. 

Figure 3. P&L graphs for the intraday alphas with liquidity bounds summarized in Table
3. Bottom-to-top-performing: i) optimization using the principal component risk model 
(Section 6.5), ii) regression over 20 principal components (Section 6.4), iii) regression 
over the BICS sub-industries (Section 6.4), iv) regression over the heterotic risk factor 
loadings (Section 6.4), and v) optimization using the heterotic risk model (Section 6.5). 
The investment level is $10M long plus $10M short. 